Engineering Research Publication
Big data refers to large-scale distributed data processing applications that operate on exceptionally large amounts of data. Google's MapReduce and Apache Hadoop, its open-source implementation, are the de facto software systems for large-scale data applications. One observation about the MapReduce framework is that it generates a large amount of intermediate data. This intermediate data is discarded after the tasks finish, because MapReduce is unable to reuse it. This paper proposes a data-aware cache framework for large-scale data applications.