A Two-Stage Algorithm for Data Clustering
Cluster analysis is a popular data analysis technique used in a wide variety of fields. K-means is a well-known and widely used clustering technique because it is simple to implement and fast in most situations. However, it suffers from two major shortcomings: it is very sensitive to the initial positions of the centroids, and it may converge to a local optimum rather than the global one. To overcome these shortcomings, the authors present a two-stage approach, called KM-HS, which is based on K-means and a heuristic search algorithm.
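The sensitivity to initial centroids can be illustrated with a minimal sketch of plain K-means. This is not the KM-HS method, only a demonstration of the shortcoming that motivates it. The function names, the four-point data set, and the two sets of starting centroids are illustrative assumptions and do not come from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def kmeans(points, centroids, max_iter=100):
    """Plain K-means from given starting centroids; returns (centroids, sse)."""
    centroids = [tuple(float(x) for x in c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [mean(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    # Sum of squared errors: each point to its nearest final centroid.
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Hypothetical data: corners of a 4 x 1 rectangle, to be split into k = 2 clusters.
POINTS = [(0, 0), (0, 1), (4, 0), (4, 1)]

_, sse_bad = kmeans(POINTS, [(2, 0), (2, 1)])   # splits bottom row / top row
_, sse_good = kmeans(POINTS, [(0, 0), (4, 0)])  # splits left pair / right pair
print(sse_bad)   # 16.0
print(sse_good)  # 1.0
```

Both runs converge, but the first start stays at a local optimum with a sum of squared errors sixteen times larger than the second. A second search stage is meant to reduce this dependence on the starting centroids.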