SCALE: A Scalable Framework for Efficiently Clustering Transactional Data


Executive Summary

This paper presents SCALE, a fully automated framework for clustering transactional data. The SCALE design has three distinguishing features. First, the authors introduce Weighted Coverage Density as a categorical similarity measure for efficiently clustering transactional datasets. The measure is intuitive, and it allows the weight of each item in a cluster to change dynamically according to how often the item occurs. Second, they develop a clustering algorithm based on the weighted coverage density measure that is fast, memory-efficient, and scalable for analyzing transactional data. Third, they introduce two clustering validation metrics and show that these domain-specific evaluation metrics are critical for capturing transactional semantics in clustering analysis.
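To make the idea of occurrence-driven item weights concrete, here is a minimal Python sketch. The summary does not state the formula, so this sketch rests on an assumption: each item's weight is its share of all item occurrences in the cluster, and the cluster's score is the weighted sum of item occurrences divided by the number of transactions. The function name `weighted_coverage_density` and the toy clusters are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_coverage_density(cluster):
    """Score a cluster of transactions (each an iterable of items).

    Assumed formulation (not stated in the summary):
        weight(item) = occurrences(item) / total occurrences in the cluster
        score        = sum(occurrences(item) * weight(item)) / number of transactions
    """
    if not cluster:
        return 0.0
    # Count each item at most once per transaction.
    occurrences = Counter(item for t in cluster for item in set(t))
    total = sum(occurrences.values())
    if total == 0:
        return 0.0
    # Weights are recomputed from the current counts, so they shift
    # whenever transactions join or leave the cluster.
    weighted_sum = sum(count * (count / total) for count in occurrences.values())
    return weighted_sum / len(cluster)


# Transactions that share all items score higher than disjoint ones.
tight = [["a", "b"], ["a", "b"]]
loose = [["a", "b"], ["c", "d"]]
print(weighted_coverage_density(tight))
print(weighted_coverage_density(loose))
```

Under this assumed formulation, frequently occurring items contribute more to the score than rare ones, so a cluster whose transactions overlap heavily scores higher than one whose transactions share few items.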

  • Format: PDF
  • Size: 275.4 KB