A Cluster Based Approach with N-Grams at Word Level for Document Classification

The rapid progress of computers and the web makes it easier to collect and store large amounts of information in the form of text, e.g., reviews, forum postings, blogs, web pages, news articles and email messages. In text mining, the growing size of text datasets and the high dimensionality associated with natural language pose a great challenge, making it difficult to classify documents into various categories and sub-categories. This paper focuses on a cluster-based document classification technique in which the data inside each cluster share some common trait.
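As a rough illustration of the idea named in the title, the sketch below extracts word-level n-grams and groups documents whose n-gram sets overlap. It is not the paper's algorithm, which this abstract does not specify: the function names (word_ngrams, jaccard, greedy_cluster), the use of Jaccard similarity, the greedy assignment rule and the threshold value are all assumptions chosen for illustration.

```python
def word_ngrams(text, n=2):
    """Return the word-level n-grams of `text` as a list of tuples."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(docs, n=2, threshold=0.2):
    """Hypothetical clustering rule (an assumption, not from the paper):
    put each document in the first cluster whose seed document shares
    enough n-grams with it, otherwise start a new cluster."""
    clusters = []  # list of (seed n-gram set, [document indices])
    for idx, doc in enumerate(docs):
        grams = set(word_ngrams(doc, n))
        for seed, members in clusters:
            if jaccard(seed, grams) >= threshold:
                members.append(idx)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append((grams, [idx]))
    return [members for _, members in clusters]


docs = [
    "the stock market fell sharply today",
    "the stock market rose sharply today",
    "my cat sleeps on the sofa",
]
groups = greedy_cluster(docs)
```

Here the first two documents share three of their seven distinct bigrams and end up in one group, while the third shares none and forms its own. A new document could then be classified by assigning it to the most similar existing cluster.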

Provided by: International Journal of Computer Applications
Topic: Data Management
Date Added: May 2015
Format: PDF
