The International Journal of Innovative Research in Computer and Communication Engineering
Data mining is the process of extracting implicit, previously unknown, and potentially useful information from data. Document clustering, a subset of data clustering, is a data mining technique that draws on concepts from information retrieval, natural language processing, and machine learning. It organizes documents into groups called clusters, such that the documents in each cluster share common properties according to a defined similarity measure. Fast, high-quality document clustering algorithms play an important role in helping users navigate, summarize, and organize information effectively.
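As a minimal sketch of what grouping documents "according to a defined similarity measure" can look like, the following Python example represents each document as a bag of word counts and compares documents with cosine similarity. The choice of cosine similarity, the greedy single-pass grouping, the threshold value, and all function names (`term_vector`, `cosine_similarity`, `cluster_documents`) are illustrative assumptions and are not taken from the text above.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words representation: word -> count (whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.3):
    """Greedy single-pass grouping (an assumed, simplified scheme).

    Each document joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [term_vector(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "data mining extracts useful information from data",
    "clustering groups similar documents together",
    "data mining finds useful patterns in data",
    "similar documents form clustering groups",
]
clusters = cluster_documents(docs)
print(clusters)  # [[0, 2], [1, 3]]
```

Here the two data mining sentences end up in one cluster and the two clustering sentences in another, because they share most of their vocabulary. Practical systems typically replace raw counts with weighted terms and use a more robust algorithm, but the role of the similarity measure is the same.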