Hadoop is a framework for the distributed processing of large data sets across clusters of commodity computers using a simple programming model. Google MapReduce and Hadoop MapReduce are two different implementations of the MapReduce framework. A MapReduce job creates intermediate data, but once the job finishes, that intermediate data is not easily accessible for reuse. Motivated by this observation, in this paper the authors propose a data-aware cache system for big-data applications using the MapReduce framework: a cache layer that extends MapReduce by efficiently identifying and accessing cache items during a MapReduce job.
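The core idea can be illustrated with a minimal sketch: cache each map task's intermediate output under a key derived from the input split and the operation applied to it, so a later job running the same operation on the same data reuses the stored result instead of recomputing it. All names here (`DataAwareCache`, `get_or_compute`, the word-count map function) are hypothetical illustrations, not the authors' actual implementation.

```python
from hashlib import sha256

class DataAwareCache:
    """Hypothetical sketch of a data-aware cache for map outputs.

    Items are keyed by (content hash of the input split, operation name),
    so identical work on identical data is detected and skipped.
    """

    def __init__(self):
        self._store = {}

    def _key(self, split, op_name):
        # Hash the split's content so the key is data-aware,
        # not tied to a file name or job id.
        digest = sha256(split.encode("utf-8")).hexdigest()
        return (digest, op_name)

    def get_or_compute(self, split, op_name, map_fn):
        key = self._key(split, op_name)
        if key in self._store:           # cache hit: reuse intermediate data
            return self._store[key], True
        result = map_fn(split)           # cache miss: run the map task
        self._store[key] = result
        return result, False


# A word-count map task as the example operation.
def word_count_map(split):
    counts = {}
    for word in split.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

cache = DataAwareCache()
first, hit1 = cache.get_or_compute("to be or not to be", "wordcount", word_count_map)
second, hit2 = cache.get_or_compute("to be or not to be", "wordcount", word_count_map)
# hit1 is False (computed), hit2 is True (served from cache)
```

In a real deployment the cache would live in a shared service visible to all workers rather than in one process's memory, but the keying scheme is the essential point: identifying cache items by the data and operation makes cross-job reuse possible.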