Date Added: Apr 2010
MapReduce is a distributed computing paradigm that is widely used for building large-scale data processing applications such as content indexing, data mining, and log file analysis. When MapReduce is offered as a cloud service, users can construct their own virtualized MapReduce clusters from Virtual Machines (VMs) managed by the cloud service provider. To keep the cost of such services low, however, cloud operators must optimize the energy consumption of these applications. This paper describes a unique spatio-temporal tradeoff for achieving energy efficiency for MapReduce jobs in such virtualized environments.