The International Journal of Innovative Research in Computer and Communication Engineering
MapReduce is a programming model widely used to process large-scale data in distributed applications because of its efficiency, simplicity, and ease of use. Hadoop is an open-source implementation of the MapReduce programming model for cloud computing. Although Hadoop delivers good performance, it still has to contend with a number of issues to achieve it: a serialization barrier that delays the reduce phase, repetitive merges and disk accesses, and a lack of portability to different interconnects. In this paper, we propose a distributed implementation of hierarchical merge using the RDT algorithm, built on the MapReduce computing model and deployed on a Hadoop cluster.