A Theoretical Study of Text Document Clustering

Provided by: International Journal of Computing Science and Information Technology (IJCSIT)
Topic: Big Data
Format: PDF
Clustering partitions an unstructured set of objects into clusters, grouping similar objects in the same cluster and dissimilar objects in different clusters as far as possible. It is a widely studied data mining problem in the text domain. In this paper, the authors explain how clustering is applied to text documents. They discuss document pre-processing, applications of text clustering, and the key text clustering methods along with their relative advantages and limitations. They also discuss recent advances in this area.
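
To make the idea of grouping similar documents concrete, the following is a minimal sketch and is not taken from the paper. It assumes a toy stop-word list, a bag-of-words representation, cosine similarity, and a simple single-pass "leader" clustering rule. The function names (preprocess, cosine, leader_cluster) and the 0.3 threshold are hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; real systems use larger ones.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "on", "for"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words, count terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def leader_cluster(docs, threshold=0.3):
    """Single-pass clustering (hypothetical helper).

    Each document joins the cluster whose first document (its leader) is
    most similar, provided the similarity reaches the threshold; otherwise
    it starts a new cluster.
    """
    vectors = [preprocess(d) for d in docs]
    leaders = []  # index of the first document in each cluster
    labels = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for c, leader in enumerate(leaders):
            sim = cosine(vec, vectors[leader])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            leaders.append(i)
            best = len(leaders) - 1
        labels.append(best)
    return labels


docs = [
    "The cat sat on the mat",
    "A cat and a kitten sat on a mat",
    "Stock markets fell sharply today",
    "Markets and stock prices fell",
]
labels = leader_cluster(docs)
print(labels)
```

In this sketch the two cat sentences share one label and the two market sentences share another. The result depends on the threshold and on document order, which is one of the limitations of single-pass approaches of this kind.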
