HybridMR: A Hierarchical MapReduce Scheduler for Hybrid Data Centers
Virtualized environments are attractive because they simplify cluster management while facilitating cost-effective workload consolidation. As a result, virtual machines, whether in public clouds or in private data centers, have become the norm for running many interactive applications such as web servers. Batch workloads like MapReduce, on the other hand, are typically deployed on native clusters to avoid the performance overheads of virtualization. While both virtual and native environments have their own strengths, the authors believe it is feasible to provide the best of these two computing paradigms in a hybrid platform, and this belief is the motivation for this paper.