International Journal of Engineering Technology, Management and Applied Sciences (IJETMAS)
In recent years, massive amounts of data have been produced by numerous sources around the globe. Data at this scale, known as big data, is difficult to analyze using existing data processing and analysis tools. To handle the big data problem, researchers have introduced several new computing platforms and tools. Hadoop is one of these tools for processing and analyzing big data. It has become the de facto standard platform for big data in industry as well as academia. Hadoop is composed of two major components: the Hadoop Distributed File System (HDFS) and MapReduce.