Efficient Identification of Subspaces with Small but Substantive Clusters in Noisy Datasets

Provided by: RWTH Aachen University
Topic: Data Management
Format: PDF
In this paper, the authors propose an efficient filter approach, called ROSMULD, that ranks subspaces by their clustering tendency, that is, by how likely each subspace is to contain areas with a (possibly slight but substantive) increase in density. Each data object votes for the subspace in which its local data density is most improbably high, and subspaces are ranked by the number of votes they receive. A data object may vote only if that density significantly exceeds the density expected from the univariate distributions.
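The voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' ROSMULD implementation: the abstract does not give the density estimator, the significance test, or the subspace dimensionality, so everything below is an assumption made for illustration. The sketch restricts itself to two-dimensional subspaces, uses equal-width grid cells as the density estimate, takes the product of the univariate bin frequencies as the expected density, and uses a normal-approximation z-score with a hypothetical threshold `z_min` as the significance check. The names `rank_subspaces`, `to_bin`, and `make_demo_data` are invented here.

```python
import math
import random
from collections import Counter
from itertools import combinations


def to_bin(value, lo, hi, bins):
    """Equal-width bin index in [0, bins - 1] (an assumed discretisation)."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * bins), bins - 1)


def rank_subspaces(data, bins=5, z_min=3.0):
    """Rank all 2-D subspaces by votes; returns [((dim_a, dim_b), votes), ...]."""
    n, d = len(data), len(data[0])

    # Discretise each dimension once and count its univariate bin frequencies.
    binned = []
    for j in range(d):
        col = [row[j] for row in data]
        lo, hi = min(col), max(col)
        binned.append([to_bin(v, lo, hi, bins) for v in col])
    marginals = [Counter(col) for col in binned]

    subspaces = list(combinations(range(d), 2))
    best_z = [z_min] * n      # an object votes only if some z exceeds z_min
    choice = [None] * n       # subspace each object currently votes for

    for a, b in subspaces:
        cells = Counter(zip(binned[a], binned[b]))
        for i in range(n):
            ia, ib = binned[a][i], binned[b][i]
            # Cell probability expected if the two dimensions were independent.
            p = (marginals[a][ia] / n) * (marginals[b][ib] / n)
            std = math.sqrt(n * p * (1.0 - p))
            if std == 0.0:
                continue
            # How far the observed cell count lies above the expected count.
            z = (cells[(ia, ib)] - n * p) / std
            if z > best_z[i]:
                best_z[i], choice[i] = z, (a, b)

    votes = Counter(c for c in choice if c is not None)
    return sorted(((s, votes[s]) for s in subspaces), key=lambda sv: -sv[1])


def make_demo_data(seed=0):
    """Synthetic data: 400 uniform noise points plus a small cluster in dims (0, 2)."""
    rng = random.Random(seed)
    data = [[rng.random() for _ in range(4)] for _ in range(400)]
    for _ in range(60):
        data.append([rng.uniform(0.45, 0.55), rng.random(),
                     rng.uniform(0.45, 0.55), rng.random()])
    return data
```

Calling `rank_subspaces(make_demo_data())` returns the six two-dimensional subspaces of the synthetic data ordered by vote count, and the planted subspace `(0, 2)` should rank first. The grid size and threshold are tuning choices of this sketch only; a coarse grid keeps the expected cell counts large enough for the normal approximation to be reasonable.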
