Date Added: May 2012
In data mining, clustering is useful for discovering groups, identifying interesting distributions in the underlying data, and supporting fast query processing. Traditional clustering algorithms favor clusters that are spherical or similar in size, and they are very fragile in the presence of outliers. The authors propose a clustering algorithm called CURE that is robust to outliers and identifies clusters with non-spherical shapes and wide variances in size. CURE achieves this by representing each cluster by a fixed number of points, generated by selecting well-scattered points from the cluster and then shrinking them toward the center of the cluster by a fraction.
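The representative-point step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not say how "well scattered" points are chosen, so the sketch assumes a farthest-point heuristic, takes the cluster center to be the coordinate-wise mean, and uses the hypothetical parameter names `c` (number of representatives) and `alpha` (shrink fraction).

```python
from math import dist


def centroid(points):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def scattered_points(points, c):
    """Pick up to c well-scattered points (assumed farthest-point heuristic)."""
    center = centroid(points)
    # Start with the point farthest from the cluster center.
    chosen = [max(points, key=lambda p: dist(p, center))]
    while len(chosen) < min(c, len(points)):
        remaining = [p for p in points if p not in chosen]
        if not remaining:  # only duplicates of already chosen points are left
            break
        # Take the point whose nearest already-chosen point is farthest away.
        chosen.append(max(remaining, key=lambda p: min(dist(p, q) for q in chosen)))
    return chosen


def shrink(reps, center, alpha):
    """Move each representative toward the center by the fraction alpha."""
    return [tuple(x + alpha * (m - x) for x, m in zip(p, center)) for p in reps]


def representatives(points, c=4, alpha=0.5):
    """Represent one cluster by c scattered points shrunk toward its center."""
    return shrink(scattered_points(points, c), centroid(points), alpha)


if __name__ == "__main__":
    # Hypothetical toy cluster: four corners of a square plus its middle point.
    cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0), (2.0, 2.0)]
    print(representatives(cluster, c=4, alpha=0.5))
```

With `alpha = 0` the representatives stay on the scattered points themselves, and with `alpha = 1` they all collapse onto the center. Values in between pull boundary points inward, which is one plausible reading of how shrinking dampens the influence of outliers while the representatives still trace a non-spherical shape.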