National Technical University of Athens
Scheduling in MapReduce environments has become increasingly important during the last years, as MapReduce has been established as the standard programming model to implement massive parallelism in large data centers. Applications of MapReduce such as search indexing, web analytics and data mining, involve the concurrent execution of several MapReduce jobs on a system like Google's MapReduce or Apache Hadoop. When a MapReduce job is executed, a number of MapReduce tasks are created. Each Map task operates on a portion of the input elements, translating them into a number of key-value pairs.