Big Data

Evaluation of Time Complexity Based on Max Average Distance for K-Means Clustering

Date Added: Apr 2012
Format: PDF

Clustering is used in diverse fields, such as information retrieval, communication security systems, and data mining. Clustering methods are divided into hierarchical clustering, partitioning clustering, and others; K-means is a partitioning method. The authors improve the performance of K-means by calculating the initial cluster centers rather than selecting them at random. Their method maximizes the distance among the initial cluster centers. As a result, the centers are distributed evenly, and the clustering result is more accurate than with randomly selected initial centers. Calculating the initial centers is itself time-consuming, but it can reduce the total clustering time by minimizing the number of allocation and recalculation steps.
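
The abstract does not spell out the exact selection rule, so the following is only a minimal sketch of one plausible reading of "max average distance" initialization, not the authors' implementation. It assumes Euclidean distance, seeds with the point farthest from the overall data mean, and then repeatedly adds the point whose average distance to the already chosen centers is largest. The function names `max_avg_distance_centers` and `kmeans` are hypothetical, as is the choice of seed point.

```python
import math


def max_avg_distance_centers(points, k):
    """Pick k initial centers that are spread far apart.

    Assumption (not from the paper): the first center is the point farthest
    from the overall mean; each later center is the unused point with the
    largest average Euclidean distance to the centers chosen so far.
    """
    dim = len(points[0])
    mean = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
    first = max(range(len(points)), key=lambda i: math.dist(points[i], mean))
    chosen = [first]
    while len(chosen) < k:
        candidates = [i for i in range(len(points)) if i not in chosen]
        best = max(
            candidates,
            key=lambda i: sum(math.dist(points[i], points[c]) for c in chosen)
            / len(chosen),
        )
        chosen.append(best)
    return [points[i] for i in chosen]


def kmeans(points, centers, max_iter=100):
    """Plain K-means (Lloyd's algorithm) from the given initial centers.

    Returns (centers, labels, passes), where passes counts the allocation
    steps performed, including the final one that detects no change.
    """
    centers = [tuple(c) for c in centers]
    labels = None
    for passes in range(1, max_iter + 1):
        # Allocation step: assign each point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            return centers, labels, passes
        labels = new_labels
        # Recalculation step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels, max_iter


# Usage: three well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
initial = max_avg_distance_centers(points, 3)
final_centers, labels, passes = kmeans(points, initial)
```

The selection loop compares every remaining point against every chosen center, which is the extra up-front cost the abstract mentions; the hoped-for payoff is a smaller `passes` value than a random start would typically need. On other data shapes this particular rule may behave differently, so it should be treated as an illustration of the idea only.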