Estimating the Number of Frequent Itemsets in a Large Database

Executive Summary

Estimating the number of frequent itemsets at a given minimum support threshold in a large dataset is of great interest from both theoretical and practical perspectives. However, counting the frequent itemsets, or even just the maximal frequent itemsets, is #P-complete. In this paper, the authors provide a theoretical investigation of the sampling estimator. They discover and prove several fundamental and rather surprising properties of this estimator. They also propose a novel algorithm that estimates the number of frequent itemsets without using sampling.

  • Format: PDF
  • Size: 276.3 KB