Limitations of Hadoop

• Hadoop MapReduce and HDFS are still under active development, so APIs and behaviour continue to change.

• The programming model is very restrictive: there is no shared, central data structure, which rules out or complicates many algorithms.

• Joins of multiple datasets are tricky and slow: there are no indices, so entire datasets often get copied across the network during the join.

• Cluster management is hard: operations such as debugging, distributing software, and collecting logs across the cluster are tedious.

• Hadoop still relies on a single master node, which requires careful operation and may limit scaling.

• Managing job flow is not trivial when intermediate data must be kept between jobs.

• The optimal configuration of nodes is not obvious, e.g. the number of mappers and reducers, or memory limits.
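As an example of the configuration problem, the knobs for reducer count and task memory live in `mapred-site.xml`; the property names below are standard in Hadoop 2+, but the values shown are arbitrary starting points, not recommendations:

```xml
<!-- mapred-site.xml: values are illustrative, not recommendations -->
<property>
  <name>mapreduce.job.reduces</name>
  <value>8</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
```

Good values depend on cluster size, input volume, and job shape, which is why tuning them is largely trial and error.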
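The join limitation above can be illustrated with a reduce-side join, the standard MapReduce join pattern. The sketch below is a minimal in-memory simulation (the function and dataset names are illustrative, not part of Hadoop's API): with no indices, the map phase must tag and emit every record from both inputs, so both datasets are shuffled in full before any matching happens.

```python
from collections import defaultdict

def reduce_side_join(users, orders):
    """Simulate a MapReduce reduce-side join on a shared key.

    Map phase: tag each record with its source dataset and emit it
    keyed by the join key -- the entire contents of both inputs are
    emitted (and, on a real cluster, shuffled over the network).
    """
    shuffled = defaultdict(list)
    for user_id, name in users:
        shuffled[user_id].append(("user", name))
    for user_id, item in orders:
        shuffled[user_id].append(("order", item))

    # Reduce phase: for each key, pair every user record with
    # every order record that shares that key.
    joined = []
    for user_id, records in shuffled.items():
        names = [v for tag, v in records if tag == "user"]
        items = [v for tag, v in records if tag == "order"]
        for name in names:
            for item in items:
                joined.append((user_id, name, item))
    return joined

users = [(1, "alice"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
print(reduce_side_join(users, orders))
# -> [(1, 'alice', 'book'), (1, 'alice', 'pen')]
```

Note that records for user 2 (no orders) and order key 3 (no user) are still shuffled, then discarded at the reducer; that wasted movement is exactly why such joins are slow.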
