MapReduce has become an important distributed processing model for large-scale, data-intensive applications such as data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely used for short jobs requiring low response time. The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Data locality is not taken into account when launching speculative map tasks, because most map tasks are assumed to be data-local. Unfortunately, neither the homogeneity assumption nor the data-locality assumption holds in virtualized data centers.
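The data-locality point can be made concrete with a toy scheduler sketch. This is illustrative Python, not Hadoop's actual scheduler API: a default speculative launcher that ignores replica placement would simply pick any idle node, whereas a locality-aware one prefers an idle node that already holds a replica of the straggling task's input split. The function and data shapes below are assumptions for illustration only.

```python
def pick_backup_node(replica_nodes, idle_nodes):
    """Choose a node for a speculative (backup) map task.

    replica_nodes: set of node names holding a replica of the task's input split.
    idle_nodes: list of currently idle node names, in scheduler preference order.

    A locality-aware policy prefers a data-local idle node; otherwise it
    falls back to the first idle node, and returns None if none is idle.
    """
    # Prefer nodes where the map input is already stored (data-local).
    local = [n for n in idle_nodes if n in replica_nodes]
    if local:
        return local[0]
    # No data-local node is idle: fall back to any idle node (rack-remote read).
    return idle_nodes[0] if idle_nodes else None
```

For example, if the input split is replicated on `n1` and `n3` and nodes `n2` and `n3` are idle, the backup task lands on `n3`; with no local replica idle, it falls back to `n2`.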
- Format: PDF
- Size: 116.3 KB