The HaLoop Approach to Large-Scale Iterative Data Analysis
The growing demand for large-scale data mining and data analysis applications has led both industry and academia to design new types of highly scalable data-intensive computing platforms. MapReduce has enjoyed particular success. However, MapReduce lacks built-in support for iterative programs, which arise naturally in many applications, including data mining, web ranking, graph analysis, and model fitting. This paper presents HaLoop, a modified version of the Hadoop MapReduce framework designed to serve these applications.