Provided by: International Journal of Computer Applications
Topic: Data Management
Rapid progress in computing and the web has made it easy to collect and store large amounts of information in the form of text, e.g., reviews, forum postings, blogs, web pages, news articles, and email messages. In text mining, the growing size of text datasets and the high dimensionality of natural language pose a major challenge, making it difficult to classify documents into categories and sub-categories. This paper focuses on a cluster-based document classification technique, in which the documents within each cluster share some common trait.
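To make the idea of grouping documents by a shared trait concrete, the following is a minimal sketch and not the paper's own method, which the abstract does not specify. It assumes a bag-of-words TF-IDF representation with smoothed IDF, cosine similarity, and a small spherical k-means loop. The function names, the toy documents, and the choice of seed documents are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw strings into unit-length TF-IDF vectors stored as dicts."""
    tokenised = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption; other weightings are equally valid).
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(members):
    """Sum the member vectors and rescale the result to unit length."""
    total = Counter()
    for vec in members:
        for t, w in vec.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def cluster_documents(docs, seed_indices, max_iter=20):
    """Assign each document to the most similar centroid until labels settle."""
    vectors = tfidf_vectors(docs)
    centroids = [vectors[i] for i in seed_indices]  # hypothetical fixed seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = centroid(members)
    return labels


docs = [
    "the match ended with a late goal",
    "stock markets fell as bank shares dropped",
    "the striker scored a goal in the match",
    "bank shares and stock prices dropped",
]
labels = cluster_documents(docs, seed_indices=[0, 1])
print(labels)  # [0, 1, 0, 1]
```

Here the two sports documents end up in one cluster and the two finance documents in the other because each pair shares vocabulary. Even on this toy input every distinct word becomes its own dimension, which hints at the high-dimensionality problem the abstract describes for real text collections.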