Date Added: Nov 2012
MapReduce is an effective technique for processing and analyzing large amounts of data. It is a programming model for rapidly processing vast amounts of data in parallel, distributed across a large cluster of machines. Hadoop is an open-source implementation of MapReduce for writing and running MapReduce applications. The problem addressed here is: which computing environment improves the performance of MapReduce when processing large amounts of data? Standalone and cloud deployments are used in the experiment to evaluate whether a MapReduce system running in cloud computing mode outperforms one running in standalone mode with respect to processing speed, response time, and cost efficiency.
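To make the programming model concrete, here is a minimal, single-process sketch of the map / shuffle / reduce phases, illustrated with the classic word-count job. This is an illustration of the model only, not Hadoop itself; all function names here are hypothetical, and a real cluster would run the map and reduce tasks in parallel across machines.

```python
from collections import defaultdict

def map_phase(records, map_fn):
    # Apply the user-defined map function to each input record,
    # producing a flat list of intermediate (key, value) pairs.
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs

def shuffle_phase(pairs):
    # Group intermediate values by key; in Hadoop this grouping
    # is performed automatically by the framework.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reduce_fn):
    # Apply the user-defined reduce function to each key's values.
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Word count: emit (word, 1) per word, then sum counts per word.
def wc_map(line):
    return [(word, 1) for word in line.split()]

def wc_reduce(word, counts):
    return sum(counts)

lines = ["hadoop runs mapreduce", "mapreduce scales"]
result = reduce_phase(shuffle_phase(map_phase(lines, wc_map)), wc_reduce)
# result == {"hadoop": 1, "runs": 1, "mapreduce": 2, "scales": 1}
```

The same user-supplied `map` and `reduce` functions run unchanged whether the framework executes them on one machine (standalone mode) or across a cluster, which is what makes the standalone-versus-cloud performance comparison meaningful.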