The Anatomy of Mapreduce Jobs, Scheduling, and Performance Challenges

Download Now
Provided by: George Mason University
Topic: Storage
Format: PDF
Hadoop is a leading open source tool that supports the realization of the big data revolution and is based on Google's MapReduce pioneering work in the field of ultra large amount of data storage and processing. Instead of relying on expensive proprietary hardware, Hadoop clusters typically consist of hundreds or thousands of multi-core commodity machines. Instead of moving data to the processing nodes, Hadoop moves the code to the machines where the data reside, which is inherently more scalable.
Download Now

Find By Topic