Business Intelligence

An Efficient Implementation of Apriori Algorithm Based on Hadoop-Mapreduce Model

Free registration required

Executive Summary

Finding frequent itemsets is one of the most important fields of data mining. Apriori algorithm is the most established algorithm for finding frequent itemsets from a transactional dataset; however, it needs to scan the dataset many times and to generate many candidate itemsets. In this paper, the authors have implemented an efficient MapReduce Apriori algorithm (MRApriori) based on Hadoop-MapReduce model which needs only two phases (MapReduce Jobs) to find all frequent k-itemsets, and compared their proposed MRApriori algorithm with current two existed algorithms which need either one or k phases (k is maximum length of frequent itemsets) to find the same frequent k-itemsets. Experimental results showed that the proposed MRApriori algorithm outperforms the other two algorithms.

  • Format: PDF
  • Size: 282.25 KB