International Journal of Engineering Research and Development (IJERD)
Clustering is the unsupervised classification of data items into groups. With the rapid growth in the number of autonomous data sources, there is an emerging need for effective approaches to distributed clustering. A distributed clustering algorithm clusters distributed datasets without gathering all the data at a single site. K-means is a popular clustering method owing to its simplicity and its speed on large datasets. However, it cannot directly handle datasets with categorical attributes, which commonly occur in real-life datasets.
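The limitation on categorical attributes can be illustrated with a minimal sketch. K-means updates each cluster centre as an arithmetic mean, which is undefined for values such as colour names. One well-known workaround, in the spirit of k-modes and not necessarily the approach taken in this paper, replaces the mean with the per-attribute mode and the Euclidean distance with a count of mismatched attributes. The records and function names below are hypothetical and chosen only for illustration.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Return the most frequent value of each attribute (a mode-based centre)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


# Hypothetical categorical records: (colour, size).
records = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
]

# A mean of "red" and "blue" has no meaning, but a mode does.
centre = column_modes(records)

# Dissimilarity of each record to the mode-based centre.
distances = [matching_dissimilarity(r, centre) for r in records]
```

Here the centre is built from the most frequent value in each column, so every record can be compared to it without any numeric encoding of the categories.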