Investigation of Data Locality and Fairness in MapReduce

Provided by: Association for Computing Machinery
Topic: Big Data
Format: PDF
In data-intensive computing, MapReduce is an important tool that lets users process large amounts of data easily. Its data-locality-aware scheduling strategy exploits the locality of data access to minimize data movement and thus reduce network traffic. In this paper, the authors first analyze state-of-the-art MapReduce scheduling algorithms and demonstrate that they do not guarantee optimal scheduling. They then reformulate the scheduling problem mathematically, using a cost matrix to capture the cost of data staging, and propose an algorithm, lsapsched, that yields optimal data locality.
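The cost-matrix formulation can be illustrated with a minimal sketch. It is not the authors' lsapsched implementation. It assumes a toy cost model in which running a task on a node that holds a replica of its input block costs 0 and any other placement costs a fixed `remote_cost`. The function names, the cost model, and the greedy baseline (a simplified stand-in for a locality-greedy scheduler, not any specific algorithm from the paper) are all hypothetical. The optimal assignment is found by exhaustive search, which is only practical for tiny inputs; a real solver for the linear sum assignment problem would use something like the Hungarian algorithm.

```python
from itertools import permutations


def build_cost_matrix(task_blocks, slot_nodes, replicas, remote_cost=1.0):
    """cost[i][j] is 0.0 if the node of slot j stores a replica of task i's
    input block, else remote_cost (an assumed, uniform staging cost)."""
    return [
        [0.0 if node in replicas[block] else remote_cost for node in slot_nodes]
        for block in task_blocks
    ]


def optimal_assignment(cost):
    """Try every one-to-one task-to-slot assignment and keep the cheapest.
    Exhaustive search: exact, but only feasible for very small matrices."""
    n_tasks, n_slots = len(cost), len(cost[0])
    if n_tasks > n_slots:
        raise ValueError("need at least as many slots as tasks")
    best_total, best_slots = float("inf"), None
    for slots in permutations(range(n_slots), n_tasks):
        total = sum(cost[i][j] for i, j in enumerate(slots))
        if total < best_total:
            best_total, best_slots = total, slots
    return best_total, list(best_slots)


def greedy_assignment(cost):
    """Hypothetical baseline: each task, in order, takes the cheapest slot
    that is still free (ties go to the lowest slot index)."""
    free = set(range(len(cost[0])))
    total, assignment = 0.0, []
    for row in cost:
        j = min(sorted(free), key=lambda s: row[s])
        free.remove(j)
        total += row[j]
        assignment.append(j)
    return total, assignment


# Block A is replicated on n1 and n2; block B lives only on n1.
replicas = {"A": {"n1", "n2"}, "B": {"n1"}}
cost = build_cost_matrix(["A", "B"], ["n1", "n2"], replicas)

print(greedy_assignment(cost))   # task A grabs n1 first, so B must read remotely
print(optimal_assignment(cost))  # A runs on n2, B on n1: both tasks are local
```

In this toy instance the greedy baseline pays one remote read, while the assignment-based search finds a placement with none, which is the kind of gap the abstract refers to when it says optimal scheduling is not guaranteed.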