An Algorithm for In-Core Frequent Itemset Mining on Streaming Data

Frequent itemset mining is a core data mining operation and has been extensively studied over the last decade. This paper takes a new approach to this problem and makes two major contributions. First, the authors present a one-pass algorithm for frequent itemset mining that has deterministic bounds on accuracy and does not require any out-of-core summary structure. Second, because the one-pass algorithm does not produce any false negatives, it can be easily extended to an accurate two-pass algorithm. The two-pass algorithm is very memory efficient and allows mining of datasets with a large number of distinct items and/or very low support levels.
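The step from "no false negatives" to an accurate two-pass result can be illustrated with a minimal sketch. This is not the authors' algorithm: it handles single items only (not itemsets) and assumes a classic Misra-Gries style counter summary for the first pass. The function names `one_pass_candidates` and `two_pass_frequent` and the parameter `theta` (the support threshold as a fraction of the stream length) are hypothetical names chosen for illustration.

```python
import math
from collections import Counter


def one_pass_candidates(stream, theta):
    """Single pass, bounded memory (assumed Misra-Gries style summary).

    Returns a superset of the items whose frequency exceeds theta * n.
    It may contain false positives but never misses a frequent item.
    """
    max_counters = math.ceil(1 / theta)  # memory bound: at most this many counters
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < max_counters:
            counters[item] = 1
        else:
            # Summary is full: decrement every counter and drop those that
            # reach zero. The incoming item is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return set(counters)


def two_pass_frequent(data, theta):
    """Second pass: count only the candidates exactly, then filter.

    Because the first pass has no false negatives, the filtered result
    is exact. `data` must be re-iterable (for example a list).
    """
    candidates = one_pass_candidates(data, theta)
    exact = Counter(x for x in data if x in candidates)
    n = len(data)
    return {x: c for x, c in exact.items() if c > theta * n}


data = ["a", "b", "a", "c", "a", "b", "d", "a", "e", "a", "b", "a"]
print(two_pass_frequent(data, 0.25))  # {'a': 6}
```

In this sketch the second pass only needs exact counters for the surviving candidates, which is why a two-pass scheme of this kind can stay small in memory even when the number of distinct items is large.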

Provided by: Ohio State University | Topic: Big Data | Date Added: Jan 2011 | Format: PDF
