In today's big data world, organizations such as Google, Amazon, and Facebook generate enormous volumes of data every day. Data at this scale is difficult to store, manage, and analyze reliably. It is typically processed on clusters of commodity hardware arranged in parallel, and because such large datasets run on distributed systems, individual processes can take a long time to execute and node failures are common. To overcome these issues, the Hadoop framework was proposed for big data processing, and it is now widely used in organizations.