Performance Analysis of K-Means With Different Initialization Methods for High Dimensional Data

Developing an effective clustering method for high-dimensional datasets is challenging because of the curse of dimensionality. Among partition-based clustering algorithms, k-means is one of the best-known methods for partitioning a dataset into groups of patterns. However, k-means converges to one of many local minima, and the final result depends on the initial starting points (means). Many methods have been proposed to improve the performance of the k-means algorithm. In this paper, the authors compare the performance of their proposed method with that of existing approaches. The proposed method uses Principal Component Analysis (PCA) both for dimension reduction and to find the initial centroids for k-means.
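
The abstract does not spell out how PCA yields the initial centroids, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes the data are projected onto the leading principal components, the points are sorted along the first component and split into k equal-sized groups, and each group's mean seeds a plain Lloyd-style k-means. The function names (pca_reduce, pca_initial_centroids, kmeans) and the sort-and-split rule are hypothetical choices made for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its top principal components (PCA via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def pca_initial_centroids(Z, k):
    """Assumed heuristic (not taken from the paper): sort the points by
    their first principal component, split them into k equal-sized
    groups, and use each group's mean as an initial centroid."""
    order = np.argsort(Z[:, 0])
    groups = np.array_split(order, k)
    return np.vstack([Z[idx].mean(axis=0) for idx in groups])


def assign(Z, centroids):
    """Return the index of the nearest centroid for every point."""
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(Z, centroids, n_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        labels = assign(Z, centroids)
        updated = centroids.copy()
        for j in range(len(centroids)):
            members = Z[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(Z, centroids), centroids


# Usage on synthetic data: three well-separated groups in 50 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 1.0, size=(30, 50)) for c in (0.0, 10.0, 20.0)])
Z = pca_reduce(X, n_components=2)
labels, centroids = kmeans(Z, pca_initial_centroids(Z, k=3))
```

Because the seeding is deterministic, repeated runs on the same data give the same clustering, which sidesteps the dependence on random starting points that the abstract describes. Whether this particular seeding rule matches the paper's would have to be checked against the full text.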

Provided by: Dr.N.G.P. Institute of Technology
Topic: Big Data
Date Added: Oct 2010
Format: PDF
