International Journal of Engineering and Advanced Technology (IJEAT)
In this paper, the authors discuss the opportunities and challenges for efficient parallel data processing in clouds and present their research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today's IaaS clouds for both task scheduling and execution. Particular tasks of a processing job can be assigned to different types of virtual machines, which are automatically instantiated and terminated during the job execution. Based on this new framework, they perform extended evaluations of MapReduce-inspired processing jobs on an IaaS cloud system and compare the results to the popular data processing framework Hadoop.