Parallelizing Large-Scale Data Processing Applications with Data Skew: A Case Study in Product-Offer Matching

Provided by: Association for Computing Machinery
Topic: Storage
Format: PDF
The last decade has seen a surge of interest in large-scale data-parallel processing engines. While these engines share many features with parallel databases, they make a different set of trade-offs. As a consequence, many of the lessons learned for programming parallel databases have to be re-learned in the new environment. In this paper, the authors present a case study of parallelizing an example large-scale application (offer matching, a core part of online shopping) on an example MapReduce-based distributed computation engine.
