Map Task Scheduling in MapReduce with Data Locality: Throughput and Heavy-Traffic Optimality

Provided by: Arizona State University
Topic: Big Data
Format: PDF
Scheduling map tasks to improve data locality is crucial to the performance of MapReduce, and many works have been devoted to increasing data locality for better efficiency. However, to the best of the authors' knowledge, the fundamental limits of MapReduce computing clusters with data locality, including the capacity region and theoretical bounds on delay performance, have not been studied. In this paper, the authors address these problems from a stochastic network perspective. Their focus is to strike the right balance between data locality and load balancing so as to simultaneously maximize throughput and minimize delay. They present a new queueing architecture and propose a map task scheduling algorithm that combines the Join the Shortest Queue policy with the MaxWeight policy.
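The abstract does not spell out the queueing architecture, so the following is only a rough sketch of how a Join the Shortest Queue routing step and a MaxWeight service step could fit together. It assumes one queue per machine for local tasks plus a single shared queue for remote tasks, and positive service rates `local_rate` and `remote_rate`; the class name `Cluster`, the method names, and the rate values are all hypothetical and are not taken from the paper.

```python
from collections import deque


class Cluster:
    """Toy model: one local queue per machine plus one shared remote queue (assumed layout)."""

    def __init__(self, num_machines, local_rate=1.0, remote_rate=0.5):
        # Rates are hypothetical and must be positive; local service is assumed faster.
        self.local = [deque() for _ in range(num_machines)]
        self.remote = deque()
        self.local_rate = local_rate
        self.remote_rate = remote_rate

    def route(self, task, local_machines):
        """Join the Shortest Queue: place the task in the shortest candidate queue.

        Candidates are the local queues of the machines holding the task's
        input data, plus the shared remote queue. Ties go to the first candidate.
        """
        candidates = [self.local[m] for m in local_machines] + [self.remote]
        min(candidates, key=len).append(task)

    def next_task(self, machine):
        """MaxWeight: an idle machine serves the queue with the larger rate * length."""
        local_q = self.local[machine]
        w_local = self.local_rate * len(local_q)
        w_remote = self.remote_rate * len(self.remote)
        if w_local == 0 and w_remote == 0:
            return None  # nothing this machine can serve
        queue = local_q if w_local >= w_remote else self.remote
        return queue.popleft()


cluster = Cluster(num_machines=3)
for name in ("a", "b", "c"):
    cluster.route(name, local_machines=[0, 1])

served_by_0 = cluster.next_task(0)  # machine 0 has a local task waiting
served_by_2 = cluster.next_task(2)  # machine 2 has no local tasks, so it takes a remote one
idle_2 = cluster.next_task(2)       # nothing left for machine 2
```

In this toy run, routing spreads the three tasks across the two local queues and the remote queue, which illustrates the load-balancing side, while the weight comparison makes a machine prefer its own local work unless the remote backlog is large enough to outweigh the slower remote rate.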