The International Journal of Innovative Research in Computer and Communication Engineering
Clustering was originally developed for numeric data, but as technology has advanced and more text documents have been digitized, searching text collections has become increasingly complex. Document clustering involves the use of descriptors and descriptor extraction. Descriptors are sets of words that describe the contents of a cluster. Document clustering is generally considered to be a centralized process. Examples of document clustering include web document clustering for search users. Applications of document clustering fall into two categories: online and offline.
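To make the idea of descriptor extraction concrete, the following minimal Python sketch picks a few descriptive words for each cluster of documents. It is an illustration only, not the method of any particular system: the function names `tokenize` and `cluster_descriptors`, the small stopword list, and the scoring rule (a term's count within a cluster divided by the number of clusters that contain it) are all assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cluster_descriptors(clusters, top_n=3):
    """Return the top_n descriptor words for each cluster.

    clusters maps a cluster id to a list of document strings.
    Score (assumed for illustration): term count inside the cluster divided by
    the number of clusters containing the term, so words shared by many
    clusters rank lower than words specific to one cluster.
    """
    # Count term occurrences per cluster.
    counts = {
        cid: Counter(t for doc in docs for t in tokenize(doc))
        for cid, docs in clusters.items()
    }

    # For each term, count how many clusters contain it.
    cluster_freq = Counter()
    for term_counts in counts.values():
        cluster_freq.update(term_counts.keys())

    descriptors = {}
    for cid, term_counts in counts.items():
        scored = [(n / cluster_freq[t], t) for t, n in term_counts.items()]
        # Highest score first; ties are broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        descriptors[cid] = [t for _, t in scored[:top_n]]
    return descriptors


if __name__ == "__main__":
    example = {
        "sports": ["football match score", "football team wins match"],
        "tech": ["new phone released", "phone software update released"],
    }
    print(cluster_descriptors(example, top_n=2))
```

The sketch assumes the documents have already been grouped into clusters; in practice the grouping itself would come from a clustering algorithm, and a weighting scheme such as TF-IDF would likely replace the simple score used here.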