K-Means Clustering with Careful Seeding For Large Cluster Number
Clustering is the unsupervised classification of patterns into groups such that objects in the same cluster are more similar to each other than to objects in other clusters. Clustering algorithms can be grouped by their cluster model: connectivity-based hierarchical algorithms, the centroid-based k-means algorithm, the distribution-based expectation-maximization algorithm, and density-based algorithms. Among these, k-means is a popular and widely used algorithm, with applications in pattern recognition, image analysis, information retrieval, and bioinformatics. However, clustering is a time-consuming task, particularly when the number of clusters is large. In this paper, the authors show that real-time k-means can be realized for a large number of clusters, up to 1024. The main issue in k-means is the initialization of the cluster centers.
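To make the initialization issue concrete, the following is a minimal sketch of one well-known careful seeding scheme, distance-weighted sampling in the style of k-means++, followed by standard Lloyd iterations. It is an illustration under stated assumptions and is not claimed to be the authors' method. The function names (`sq_dist`, `seed_centers`, `kmeans`) and parameters (`iters`, `seed`) are hypothetical, and points are assumed to be tuples of numbers.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seed_centers(points, k, rng):
    """Pick k initial centers by distance-weighted sampling (k-means++ style).

    Assumption: this stands in for "careful seeding"; the source does not
    specify the exact scheme.
    """
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Points far from all current centers are more likely to be picked.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(old, sq_dist(p, nxt)) for p, old in zip(points, d2)]
    return centers


def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm started from the seeded centers."""
    rng = random.Random(seed)
    centers = seed_centers(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels
```

The assignment step compares every point with every center, so the per-iteration cost grows linearly with the number of clusters. This is why a large cluster count, such as the 1024 mentioned above, makes both seeding and iteration expensive.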