Binary Information Press
Text clustering is a difficult and active research area in Internet search engine research. This paper presents a new text clustering algorithm that draws on agglomerative hierarchical clustering techniques. First, the texts are preprocessed to prepare them for the subsequent steps. The paper then analyzes the common K-means clustering algorithm and improves it in three respects: the method for selecting initial cluster centers, the principle of the algorithm, and the algorithm's flow. These changes address the weakness that K-means is very sensitive to the initial cluster centers and to isolated-point (outlier) texts. The experimental results indicate that the improved algorithm achieves higher accuracy and better stability than the original algorithm.
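To make the sensitivity to initial cluster centers concrete, the following minimal sketch runs K-means with a deterministic choice of starting centers. The abstract does not state how the improved algorithm selects its centers, so the farthest-first rule used here is an assumption chosen only for illustration, and the names `farthest_first_centers` and `kmeans` are hypothetical. Farthest-first selection tends to pick isolated points, so this sketch does not reproduce the paper's handling of outlier texts. Texts are assumed to be already preprocessed into numeric feature vectors.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first_centers(points, k):
    """Pick k initial centers deterministically (illustrative assumption,
    not the paper's method): start from the point nearest the global mean,
    then repeatedly add the point farthest from all centers chosen so far."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centers = [min(points, key=lambda p: dist(p, mean))]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(nxt)
    return [list(c) for c in centers]


def kmeans(points, k, max_iter=100):
    """Standard K-means (Lloyd) iterations from the centers chosen above."""
    dim = len(points[0])
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)
                ]
    return labels, centers


if __name__ == "__main__":
    # Toy 2-D vectors standing in for preprocessed text features.
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = kmeans(docs, k=2)
    print(labels)
    print(centers)
```

Because the starting centers are not drawn at random, repeated runs on the same input give the same clustering, which is one simple way to obtain the kind of stability the abstract describes. On the toy data above, the first three vectors and the last three vectors end up in separate clusters.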