Enhancing MapReduce Via Asynchronous Data Processing
Source: Virginia Tech
The MapReduce programming model simplifies large-scale data processing on commodity clusters by having users specify a map function that processes input key/value pairs to generate intermediate key/value pairs, and a reduce function that merges and converts intermediate key/value pairs into final results. Typical MapReduce implementations such as Hadoop enforce barrier synchronization between the map and reduce phases, i.e., the reduce phase does not start until all map tasks are finished. In turn, this synchronization requirement can cause inefficient utilization of computing resources and can adversely impact performance.
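The model and the barrier described above can be illustrated with a minimal single-process sketch. It is not Hadoop's API and not the paper's implementation: the word-count task and the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_fn(key, value):
    """Hypothetical map function: emit (word, 1) for each word in a line."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Hypothetical reduce function: merge all values for one key."""
    return key, sum(values)


def run_mapreduce(records, map_fn, reduce_fn):
    """Run map, then reduce, with a barrier between the two phases."""
    # Map phase: process every input pair and group the intermediate
    # pairs by key.
    intermediate = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)

    # Barrier: the loop above has consumed all input, so no reduce call
    # starts before the last map call has finished.

    # Reduce phase: merge the grouped values into final results.
    return dict(reduce_fn(k, vs) for k, vs in sorted(intermediate.items()))


# Illustrative input: (line number, line text) pairs.
records = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "The fox")]
counts = run_mapreduce(records, map_fn, reduce_fn)
```

In this sketch the reduce phase sits idle until every record has been mapped, which is the synchronization cost the abstract refers to.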
Size: 146.40
Date: Nov 2010



