Using Statistics for Computing Joins with MapReduce

Provided by: RWTH Aachen University
Topic: Enterprise Software
Format: PDF
The MapReduce model was designed to cope with ever-growing amounts of data and has been applied successfully to a wide range of computational problems. In recent years, multiple MapReduce algorithms have also been developed for computing joins, one of the fundamental problems in managing and querying data. When these algorithms distribute the computation tasks to the available reducers, their main optimization goals are the replication rate and the maximum load of the reducers. The HyperCube algorithm of Afrati and Ullman minimizes the former, the replication rate, by considering only the sizes of the involved tables.
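
To make the two optimization goals concrete, the following is a minimal sketch of a HyperCube-style distribution step. It is not taken from the paper: the function names, the triangle-join example, the choice of two buckets per attribute, and the use of a plain modulo as the hash function are all assumptions made for illustration. Each join attribute gets a number of buckets (its share), a reducer is one cell of the resulting grid, and a tuple is copied to every cell that agrees with it on the attributes it contains.

```python
from collections import defaultdict
from itertools import product


def hypercube_distribute(relations, shares):
    """Assign tuples to reducers arranged as a grid (hypothetical helper).

    relations: list of (name, attribute_names, tuples of ints)
    shares:    dict mapping each join attribute to its number of buckets
    """
    attr_order = sorted(shares)
    reducers = defaultdict(list)
    for name, attrs, tuples in relations:
        for t in tuples:
            bound = dict(zip(attrs, t))
            # Attributes present in the tuple fix one coordinate (modulo
            # stands in for a hash function); missing attributes mean the
            # tuple is replicated along that whole dimension.
            axes = [
                [bound[a] % shares[a]] if a in bound else range(shares[a])
                for a in attr_order
            ]
            for cell in product(*axes):
                reducers[cell].append((name, t))
    return reducers


def replication_rate(reducers, relations):
    """Tuples sent to reducers divided by tuples in the input."""
    sent = sum(len(load) for load in reducers.values())
    total = sum(len(tuples) for _, _, tuples in relations)
    return sent / total


def max_load(reducers):
    """Largest number of tuples received by any single reducer."""
    return max(len(load) for load in reducers.values())


# Assumed example: triangle join R(A,B), S(B,C), T(C,A) over values {0, 1}.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
relations = [
    ("R", ("A", "B"), pairs),
    ("S", ("B", "C"), pairs),
    ("T", ("C", "A"), pairs),
]
shares = {"A": 2, "B": 2, "C": 2}  # 2 * 2 * 2 = 8 reducers

reducers = hypercube_distribute(relations, shares)
print(replication_rate(reducers, relations), max_load(reducers))
```

In this toy input the values are spread evenly, so every reducer receives the same number of tuples. Choosing the shares from table sizes alone says nothing about how values are distributed inside a table, which is why the maximum load is treated as a separate goal from the replication rate.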
