International Journal of Science and Research (IJSR)
Nowadays, most cloud applications process large amounts of data to produce the desired results. The data volumes that cloud applications must process are growing much faster than computing power, and this growth demands new strategies for processing and analyzing information. This paper explores the use of the Hadoop MapReduce framework to execute scientific workflows in the cloud. Cloud computing provides massive clusters for efficient large-scale computation and data analysis. In the underlying distributed file system, a file is partitioned into a number of chunks allocated to distinct nodes, so that MapReduce tasks can run in parallel across the nodes, making resource utilization effective and improving the response time of the job.
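The map-and-reduce pattern described above can be sketched with a minimal word-count example. This is an illustrative simulation only, not the paper's implementation: the chunk contents and the helper names `map_phase`, `shuffle`, and `reduce_phase` are assumptions, and a real Hadoop cluster would run the map tasks in parallel on the nodes holding each chunk rather than sequentially in one process.

```python
from collections import defaultdict
from itertools import chain

def map_phase(chunk):
    # Map task: emit a (word, 1) pair for every word in this file chunk.
    return [(word, 1) for word in chunk.split()]

def shuffle(mapped):
    # Shuffle step: group intermediate pairs by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped):
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce task: sum the counts emitted for each word.
    return {word: sum(counts) for word, counts in groups.items()}

# Stand-in for a file partitioned into chunks on distinct nodes.
chunks = ["cloud data cloud", "data analysis data"]
result = reduce_phase(shuffle(map_phase(c) for c in chunks))
print(result)  # {'cloud': 2, 'data': 3, 'analysis': 1}
```

Because each chunk is mapped independently, the framework can schedule one map task per chunk on the node that stores it, which is the data-locality property the abstract alludes to.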