An Effective and Efficient Algorithm for Document Clustering

Provided by: International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
Topic: Data Management
Format: PDF
In this paper, the authors propose an effective and efficient algorithm for clustering text documents. The algorithm builds on the well-known k-means algorithm. Standard k-means chooses its initial cluster centers at random, so its results can vary from run to run and depend on that initial choice. The proposed algorithm eliminates this problem by introducing a new approach for selecting the initial cluster centroids. Several experiments are conducted on the mini-news group dataset to measure the performance of the proposed algorithm, and the results are very promising when compared with two other algorithms: k-means and enhanced k-means.
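
To illustrate why the choice of initial centers matters, the following is a minimal sketch and not the authors' method: the abstract does not describe their selection procedure, so a farthest-first heuristic is used here purely as a stand-in for a deterministic initialization. The function names (`random_init`, `farthest_first_init`, `kmeans`) and the toy term-frequency vectors are assumptions made for illustration only.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def random_init(points, k, seed):
    """Standard k-means seeding: pick k distinct points at random."""
    return random.Random(seed).sample(points, k)


def farthest_first_init(points, k):
    """Deterministic stand-in seeding (an assumption, not the paper's method).

    Start from the point closest to the overall mean, then repeatedly add
    the point that is farthest from all centers chosen so far.
    """
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    centers = [min(points, key=lambda p: distance(p, mean))]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(distance(p, c) for c in centers))
        )
    return centers


def kmeans(points, centers, max_iter=100):
    """Run Lloyd's iterations from the given centers; return (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical term-frequency vectors for six short documents on two topics.
docs = [
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 1.0, 0.0],
    [5.0, 5.0, 0.0, 1.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
    [0.0, 0.0, 5.0, 5.0],
]

# Deterministic seeding: the same centers, and so the same clusters, every run.
labels, centers = kmeans(docs, farthest_first_init(docs, 2))

# Random seeding: the starting centers depend on the seed.
random_labels, _ = kmeans(docs, random_init(docs, 2, seed=0))
```

With the deterministic seeding, repeated runs on the same data start from the same centers, which is the property the abstract attributes to the proposed algorithm; how the authors actually pick those centers is not stated here.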
