On the Performance and Energy Efficiency of Hadoop Deployment Models
The exponential growth of scientific and business data has driven the rise of cloud computing and the MapReduce parallel programming model. Cloud computing emphasizes increased utilization and power savings through consolidation, while MapReduce enables large-scale data analysis. The Hadoop framework has recently become the standard implementation of the MapReduce model. In this paper, the authors evaluate Hadoop performance under the traditional model, in which data and compute services are collocated, and consider the impact of separating the two services.