Evaluating Clustering in Subspace Projections of High Dimensional Data

Executive Summary

Knowledge discovery in databases provides database owners with new information about patterns in their data. Clustering is a traditional data mining task for the automatic grouping of objects. Cluster detection is based on similarity between objects, typically measured with distance functions. In high-dimensional spaces, effects attributed to the "curse of dimensionality" are known to break traditional clustering algorithms: as dimensionality grows, distances between objects become increasingly similar, so meaningful clusters can no longer be detected.
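The loss of distance contrast described in the summary can be illustrated with a small experiment. The following is a minimal sketch, not the paper's evaluation method: it assumes points drawn uniformly from the unit hypercube and Euclidean distance, and the function name `relative_contrast` and its parameters are hypothetical names chosen for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points random points.

    Assumption for illustration: all coordinates are uniform in [0, 1).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast between the farthest and nearest point shrinks as the
    # number of dimensions grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions, the printed contrast should fall sharply as `dim` increases, meaning the nearest and farthest points end up at nearly the same distance from the query. Real data sets are rarely uniform, so the effect may be weaker or stronger in practice.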

  • Format: PDF
  • Size: 1636.2 KB