International Journal of Computing Science and Information Technology (IJCSIT)
Cloud computing is emerging as a new computational paradigm shift. Hadoop MapReduce has become a powerful computation model for processing large data on distributed commodity hardware clusters such as clouds. MapReduce is a programming model developed for large-scale analysis. It takes advantage of the parallel processing capabilities of a cluster in order to quickly process very large datasets in a fault-tolerant and scalable manner. The core idea behind MapReduce is mapping the data into a collection of key/value pairs, and then reducing overall pairs with the same key.