Improving the Accuracy and Efficiency of K-Means Algorithm by Using Less Similarity Based Clustering Technique for Better Initial Centroids
Data mining, a field at the intersection of computer science and statistics, is the process of discovering patterns in large data sets. It draws on methods from artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure. In data mining, the K-means clustering algorithm is one of the efficient unsupervised learning algorithms for solving the well-known clustering problem. The disadvantage of the k-means algorithm is that its accuracy and efficiency vary with the choice of initial cluster centers when these are chosen randomly.
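The sensitivity to initial centers can be illustrated with a minimal sketch. The code below is a plain implementation of standard k-means (Lloyd's iteration), not the technique proposed in this paper; the function names `squared_distance` and `kmeans`, the four-point data set, and the two starting configurations are all assumptions chosen for illustration. It runs the same algorithm on the same data from two different pairs of initial centroids and compares the resulting sum of squared errors (SSE).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Standard k-means from the given initial centroids.

    Returns (final_centroids, sse, iterations_used).
    """
    centroids = [tuple(c) for c in centroids]
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    sse = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, sse, iteration


# Illustrative data (an assumption): corners of a 4-by-1 rectangle, k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Two possible "random" choices of initial centers taken from the data.
_, sse_good, _ = kmeans(points, [(0, 0), (4, 1)])  # one center per short side
_, sse_bad, _ = kmeans(points, [(0, 0), (0, 1)])   # both centers on one side

print(sse_good, sse_bad)  # prints: 1.0 16.0
```

In this toy case the second starting configuration converges to a stable but poor partition that splits the points along the long axis, so the final error is sixteen times larger even though the data and the algorithm are unchanged.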