On the Design Space of MapReduce ROLLUP Aggregates

Provided by: Creative Commons
Topic: Data Management
Format: PDF
In this paper, the authors define and explore the design space of efficient algorithms for computing ROLLUP aggregates in the MapReduce programming paradigm. Using a modeling approach, they explain the non-trivial trade-off between parallelism and communication cost that is inherent to a MapReduce implementation of ROLLUP. They then design a new family of algorithms that, through a single parameter, makes it possible to find a "sweet spot" in this trade-off. They complement the model with an experimental evaluation, which lets them overcome some of the model's limitations.
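
To make the parallelism vs. communication trade-off concrete, the following is a minimal single-process sketch in Python. It is not the authors' algorithm family. It only contrasts two illustrative extremes under assumed names (`map_all_levels`, `map_finest_only`, `rollup_from_finest`, and the toy `records` are all hypothetical). In the first, the mapper emits one pair per ROLLUP level, so every level can be reduced in parallel at the price of more shuffled data. In the second, the mapper emits only the finest-grained key, and the coarser levels are derived afterwards in a less parallel step.

```python
from collections import defaultdict


def rollup_keys(dims):
    """Yield every ROLLUP prefix of dims, from the full tuple down to ()."""
    for i in range(len(dims), -1, -1):
        yield dims[:i]


def map_all_levels(record):
    """Hypothetical mapper A: emit one (key, value) pair per ROLLUP level."""
    *dims, value = record
    for key in rollup_keys(tuple(dims)):
        yield key, value


def map_finest_only(record):
    """Hypothetical mapper B: emit only the finest-grained grouping key."""
    *dims, value = record
    yield tuple(dims), value


def shuffle_and_sum(pairs):
    """Stand-in for shuffle plus a SUM reducer: group pairs by key and add."""
    groups = defaultdict(int)
    for key, value in pairs:
        groups[key] += value
    return dict(groups)


def rollup_from_finest(finest):
    """Derive all coarser levels from the finest-level sums (sequential step)."""
    result = defaultdict(int)
    for key, total in finest.items():
        for prefix in rollup_keys(key):
            result[prefix] += total
    return dict(result)


# Toy input, assumed for illustration: (year, month, value)
records = [
    (2023, 1, 10),
    (2023, 1, 5),
    (2023, 2, 7),
    (2024, 1, 3),
]

pairs_a = [p for r in records for p in map_all_levels(r)]
pairs_b = [p for r in records for p in map_finest_only(r)]

result_a = shuffle_and_sum(pairs_a)
result_b = rollup_from_finest(shuffle_and_sum(pairs_b))

# Both strategies give the same aggregates; they differ in shuffled pairs.
print(len(pairs_a), len(pairs_b), result_a == result_b)
```

With two grouping dimensions, the first strategy shuffles three pairs per input record and the second shuffles one, while both produce the same ROLLUP result. The sketch ignores combiners, data skew, and the number of reducers, all of which affect the real cost in a MapReduce deployment.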
