Reduction of Data at Namenode in HDFS Using Harballing Technique

HDFS, the Hadoop Distributed File System, is designed to handle large files (megabytes, gigabytes, or terabytes in size). Scientific applications have adopted HDFS and MapReduce for large-scale data analytics. A major problem, however, is that small files are common in these applications. HDFS manages all of these small files through a single Namenode server. Storing and processing them in HDFS adds overhead to MapReduce programs and degrades the performance of the Namenode.
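To make the small-file pressure on the Namenode concrete, the following is a rough back-of-the-envelope sketch, not the paper's method. It assumes the Namenode tracks one namespace object per file plus one per block, and it uses a commonly quoted rule of thumb of about 150 bytes of Namenode memory per object. The function names, the 128 MB block size, and the workload figures are all illustrative assumptions. Packing the small files into one large file is an idealization of archiving; a real Hadoop archive also carries its own index files.

```python
# Assumed rule of thumb for Namenode heap per namespace object (not from the source).
BYTES_PER_OBJECT = 150
# Assumed HDFS block size; clusters may be configured differently.
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024


def namenode_objects(num_files, file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Count namespace objects: one per file plus one per block of each file."""
    # Ceiling division; even a tiny file occupies at least one block entry.
    blocks_per_file = max(1, -(-file_size // block_size))
    return num_files * (1 + blocks_per_file)


def estimated_bytes(num_objects):
    """Approximate Namenode memory for the given number of objects."""
    return num_objects * BYTES_PER_OBJECT


# Hypothetical workload: one million files of 100 KB each.
num_small_files = 1_000_000
small_file_size = 100 * 1024

separate = namenode_objects(num_small_files, small_file_size)
# Idealized archive: the same bytes stored as a single large file.
packed = namenode_objects(1, num_small_files * small_file_size)

print("objects when stored separately:", separate)
print("objects when packed together:  ", packed)
print("estimated bytes saved:", estimated_bytes(separate - packed))
```

Under these assumptions the object count depends almost entirely on the number of files rather than on the volume of data, which is why grouping small files reduces the metadata held at the Namenode.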

Provided by: International Journal of Advanced Research in Computer Engineering & Technology
Topic: Software
Date Added: Jun 2012
Format: PDF
