Engineering Research Publication
Clustering organizes a large collection of unstructured text documents into a small number of meaningful, coherent clusters. The authors compare and analyze the effectiveness of the measures under study in partitional clustering of text document datasets. Clustering supports information extraction and fast information retrieval or filtering. In document retrieval, clustering methods can automatically group the retrieved documents into a list of useful categories. Document clustering involves descriptors and descriptor retrieval. A descriptor is a set of words that describes the contents of a cluster. Document clustering is generally considered to be a centralized process.
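As a rough illustration of partitional clustering with cluster descriptors, the following minimal sketch groups a few toy documents and lists the most frequent terms of each cluster. It is not the authors' method: the use of cosine similarity, the k-means-style update, the stop-word list, the seed choice, and all names (`term_vector`, `partition_cluster`, `describe`) are assumptions made for this example only.

```python
from collections import Counter
import math

# Hypothetical, deliberately tiny stop-word list for the toy example.
STOPWORDS = {"and", "are", "the", "a", "of"}


def term_vector(text):
    """Bag-of-words term counts for one document, minus stop words."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def partition_cluster(docs, seed_indices, iterations=10):
    """K-means-style partitioning: assign each document to its most
    similar centroid, then rebuild each centroid from its members."""
    vectors = [term_vector(d) for d in docs]
    centroids = [Counter(vectors[i]) for i in seed_indices]
    labels = None
    for _ in range(iterations):
        new_labels = []
        for v in vectors:
            sims = [cosine(v, c) for c in centroids]
            new_labels.append(sims.index(max(sims)))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                # Summed counts point in the same direction as the mean,
                # which is all cosine similarity depends on.
                total = Counter()
                for v in members:
                    total.update(v)
                centroids[c] = total
    return labels, centroids


def describe(centroids, n=2):
    """Descriptor of a cluster: its n most frequent terms."""
    return [[term for term, _ in c.most_common(n)] for c in centroids]


docs = [
    "cats and dogs are pets",
    "dogs chase cats",
    "stocks and bonds are investments",
    "investors buy stocks",
]
labels, centroids = partition_cluster(docs, seed_indices=(0, 2))
descriptors = describe(centroids)
print(labels, descriptors)
```

The seeds are fixed here so the run is repeatable; a practical system would normally choose them randomly or with a dedicated initialization scheme, and would likely weight terms (for example with TF-IDF) rather than use raw counts.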