PARMA: A Parallel Randomized Algorithm for Approximate Association Rules Mining in MapReduce

Provided by: Brown University
Topic: Big Data
Format: PDF
The authors present a novel randomized parallel technique for mining frequent itemsets and association rules. Their algorithm, PARMA, achieves near-linear speedup while avoiding costly replication of data. It does so by drawing multiple small random samples of the transactional dataset and running a mining algorithm on each sample independently and in parallel. The collections of frequent itemsets or association rules obtained from the samples are then aggregated and filtered into a single output collection. Because PARMA mines random subsets of the dataset, the final result is an approximation of the exact solution.
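The sample, mine, aggregate, and filter flow described above can be illustrated with a minimal single-machine sketch in Python. It is not the authors' implementation and does not use MapReduce. The function names, the brute-force miner, sampling with replacement, the `min_votes` filter, and the averaging of frequencies are all assumptions made for illustration; the abstract does not specify how PARMA aggregates or filters its results.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations


def mine_frequent_itemsets(transactions, min_freq, max_size=2):
    """Hypothetical brute-force miner: map each itemset of up to
    max_size items to its relative frequency, keeping those >= min_freq."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_freq}


def approximate_frequent_itemsets(dataset, num_samples, sample_size,
                                  min_freq, min_votes, seed=0):
    """Mine several random samples and combine the per-sample results.

    Assumed aggregation rule (not taken from the source): keep an itemset
    only if at least min_votes samples report it as frequent, and estimate
    its frequency as the mean over the samples that reported it.
    """
    rng = random.Random(seed)
    per_sample_results = []
    # Each iteration is independent of the others, so in a parallel
    # setting every sample could be mined by a separate worker.
    for _ in range(num_samples):
        sample = [rng.choice(dataset) for _ in range(sample_size)]
        per_sample_results.append(mine_frequent_itemsets(sample, min_freq))

    votes = Counter()
    freq_sums = defaultdict(float)
    for result in per_sample_results:
        for itemset, freq in result.items():
            votes[itemset] += 1
            freq_sums[itemset] += freq

    return {s: freq_sums[s] / votes[s]
            for s in votes if votes[s] >= min_votes}


if __name__ == "__main__":
    # Toy dataset: 80% of transactions contain {a, b}, 20% contain {c}.
    toy_dataset = [["a", "b"]] * 80 + [["c"]] * 20
    estimates = approximate_frequent_itemsets(
        toy_dataset, num_samples=5, sample_size=200,
        min_freq=0.5, min_votes=3)
    for itemset, freq in sorted(estimates.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), round(freq, 3))
```

Because every sample is drawn at random, the reported frequencies are estimates rather than exact counts, which mirrors the approximation noted in the abstract. Requiring agreement across several samples is one plausible way to filter out itemsets that look frequent in a single sample by chance.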