Cura: A Cost-optimized Model for MapReduce in a Cloud
The authors propose Cura, a new MapReduce cloud service model for data analytics in the cloud. They argue that existing cloud service models for MapReduce analytics, whether a generic compute cloud or a dedicated MapReduce cloud, are inadequate and inefficient for production workloads. Existing services require users to select a number of complex cluster and job parameters, and they force the cloud provider to run with those potentially sub-optimal configurations, which results in poor resource utilization and higher cost. In contrast, Cura leverages MapReduce profiling to automatically create the best cluster configuration for the jobs, so as to obtain a global resource optimization from the provider's perspective.
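
To make the idea of profile-driven configuration concrete, here is a minimal sketch in Python. It is not Cura's algorithm: it handles a single job rather than the global, provider-side optimization the authors describe. The runtime model (a serial part plus a part that divides evenly across nodes), the `JobProfile` fields, the deadline, and the per-node price are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class JobProfile:
    """Hypothetical profile of one MapReduce job (assumed, not from the paper)."""
    serial_seconds: float    # work that does not speed up with more nodes
    parallel_seconds: float  # work that divides evenly across nodes


def estimated_runtime(profile: JobProfile, nodes: int) -> float:
    """Assumed runtime model: serial part plus parallel part split over nodes."""
    return profile.serial_seconds + profile.parallel_seconds / nodes


def cheapest_cluster(
    profile: JobProfile,
    deadline_seconds: float,
    price_per_node_second: float,
    max_nodes: int = 64,
) -> Optional[Tuple[int, float]]:
    """Return (nodes, cost) for the cheapest cluster size that meets the
    deadline, or None if no size up to max_nodes is fast enough."""
    best: Optional[Tuple[int, float]] = None
    for nodes in range(1, max_nodes + 1):
        runtime = estimated_runtime(profile, nodes)
        if runtime > deadline_seconds:
            continue  # this cluster size misses the deadline
        cost = nodes * runtime * price_per_node_second
        if best is None or cost < best[1]:
            best = (nodes, cost)
    return best


if __name__ == "__main__":
    profile = JobProfile(serial_seconds=100.0, parallel_seconds=3600.0)
    print(cheapest_cluster(profile, deadline_seconds=600.0,
                           price_per_node_second=0.01))
```

Under this assumed model the cost grows with the number of nodes, so the search settles on the smallest cluster that still meets the deadline. The user supplies only the job and a deadline, and the configuration is derived from the profile instead of being chosen by hand.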