A Comprehensive Survey on Centroid Selection Strategies for Distributed K-Means Clustering Algorithm

Provided by: International Journal of Computer Applications
Topic: Big Data
Format: PDF
Extremely large data sets, often known as "Big Data", are analyzed for interesting patterns, trends, and associations, especially those relating to human behavior and interactions. Extracting meaningful and useful information from such data needs to be done in parallel using advanced clustering algorithms. In this paper, an effort has been made to modify the existing K-means algorithm so that it works in parallel using the MapReduce paradigm. K-means, due to its gradient descent nature, is highly sensitive to the initial placement of the cluster centers.
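To make the two ideas in the abstract concrete, the following is a minimal single-machine sketch of K-means written as a map step and a reduce step, assuming squared Euclidean distance and points given as tuples of floats. It is not the paper's implementation: the function names (nearest, map_phase, reduce_phase, kmeans) are hypothetical, and a real MapReduce job would distribute the map step across workers rather than run it in one process.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster_index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per cluster, then average them."""
    sums = {}
    counts = defaultdict(int)
    for idx, (point, n) in pairs:
        if idx in sums:
            sums[idx] = [s + p for s, p in zip(sums[idx], point)]
        else:
            sums[idx] = list(point)
        counts[idx] += n
    # Assumption: a centroid that received no points keeps its old position.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=100):
    """Repeat map and reduce until the centroids stop moving."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Four points forming a wide, flat rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

# Starting near the left and right edges separates the two obvious groups.
good = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])

# Starting between the groups converges to a poor split (top row vs. bottom row).
poor = kmeans(points, [(5.0, 0.0), (5.0, 1.0)])
```

On this toy data the first run ends with centroids at the left and right ends of the rectangle, while the second run never leaves its starting positions, which illustrates why the choice of initial centers matters.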
