Date Added: Apr 2012
Clustering high-dimensional data has long been a major challenge for clustering algorithms because of the intrinsic sparsity of the data points. Several recent research results indicate that, for high-dimensional data, even the notion of proximity or clustering may not be meaningful. K-Means is one of the basic clustering algorithms and is commonly used in many applications, but it cannot discover subspace clusters. The subspaces are specific to the clusters themselves. In this paper, an algorithm called Modified Projected K-Means Clustering Algorithm with Effective Distance Measure is designed to generalize the K-Means algorithm so that it can handle high-dimensional data.
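
The claim that proximity loses meaning in high dimensions can be illustrated with a small sketch. It is not taken from the paper: the function name `relative_contrast`, the uniform random data, and the chosen dimensions are all assumptions made for illustration. It measures how much farther the farthest point is from a query than the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from a
    random query point to n_points random points in the unit hypercube.
    Hypothetical helper for illustration, not part of the paper."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


# Compare a low-dimensional and a high-dimensional setting.
low = relative_contrast(dim=2)
high = relative_contrast(dim=500)
print(f"relative contrast in 2 dimensions:   {low:.3f}")
print(f"relative contrast in 500 dimensions: {high:.3f}")
```

Under these assumptions the contrast shrinks sharply as the dimension grows, so the nearest and farthest points become almost equally distant. This is one reason full-space distance measures, such as the one used by standard K-Means, become less informative and why restricting each cluster to its own subspace can help.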