Iterative MapReduce for Azure Cloud
The MapReduce distributed data processing architecture has become the de facto mechanism for data-intensive analysis in compute clouds and on commodity clusters, mainly because of its excellent fault tolerance, scalability, ease of use, and simple programming model. MapReduceRoles for Azure (MR4Azure) is a decentralized, dynamically scalable MapReduce runtime that the authors developed for the Windows Azure cloud platform, using Microsoft Azure cloud infrastructure services as building blocks. This paper presents Twister4Azure, which adds support for optimized iterative MapReduce computations to MR4Azure, based on the concepts of the Twister iterative MapReduce framework. Twister4Azure enables a wide range of large-scale iterative data analysis and scientific applications to use the Azure platform easily and efficiently, while preserving the fault tolerance and the decentralized, dynamic scheduling features of MR4Azure.
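To make the notion of an iterative MapReduce computation concrete, the following is a minimal single-process sketch in Python. It is not the Twister4Azure or MR4Azure API; the function names (`map_phase`, `reduce_phase`, `iterative_mapreduce`) and the choice of one-dimensional k-means as the example workload are assumptions made purely for illustration. The sketch only shows the general pattern: the same static input is mapped and reduced repeatedly, and the reduced result feeds the next iteration until it stops changing.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Emit (nearest centroid index, (point, 1)) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, (p, 1)


def reduce_phase(pairs):
    """Sum the points and counts emitted for each centroid index."""
    sums = defaultdict(lambda: [0.0, 0])
    for key, (value, count) in pairs:
        sums[key][0] += value
        sums[key][1] += count
    return sums


def iterative_mapreduce(points, centroids, max_iterations=100, tolerance=1e-9):
    """Repeat map and reduce over the same static data until centroids settle.

    Hypothetical driver loop; assumes max_iterations >= 1.
    """
    for iteration in range(1, max_iterations + 1):
        sums = reduce_phase(map_phase(points, centroids))
        # Keep the old centroid if no point was assigned to it.
        updated = [
            sums[i][0] / sums[i][1] if i in sums else c
            for i, c in enumerate(centroids)
        ]
        # Largest movement of any centroid in this iteration.
        shift = max(abs(a - b) for a, b in zip(updated, centroids))
        centroids = updated
        if shift <= tolerance:
            break
    return centroids, iteration
```

In this sketch the input points never change between iterations; only the small set of centroids does. That split between large static data and small per-iteration data is what makes iterative workloads a natural target for runtime-level optimization.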