A New Text Mining Approach Based on HMM-SVM for Web News Classification

Date Added: Aug 2010
Format: PDF

Since the emergence of the World Wide Web, it has become essential to handle very large amounts of electronic data, most of which is in the form of text. Various data mining techniques can handle this volume effectively. This paper proposes an intelligent system for online news classification based on the Hidden Markov Model (HMM) and the Support Vector Machine (SVM). The system extracts keywords from online newspaper content and classifies that content into predefined categories. It works in three stages: text pre-processing, HMM-based feature extraction, and classification using SVM.
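
The three-stage structure can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not describe how the HMM extracts features, so stage 2 below is a plain term-frequency stand-in, and stage 3 is a simple linear SVM trained by subgradient descent on the hinge loss. The stopword list, the two categories ("sports" and "business"), the sample headlines, and all function names are assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "for", "with", "by"}

def preprocess(text):
    """Stage 1: lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def featurize(tokens, vocab):
    """Stage 2 stand-in: term-frequency vector over a fixed vocabulary.
    The paper uses HMM-based feature extraction here instead."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocab]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Stage 3: linear SVM via subgradient descent on the L2-regularized
    hinge loss. Labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Margin violated: regularization step plus hinge subgradient.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: regularization step only.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

# Made-up training headlines for two hypothetical categories.
TRAIN = [
    ("The team won the football match", "sports"),
    ("Striker scores goal in final match", "sports"),
    ("Stock market shares fall", "business"),
    ("Bank raises interest rates for shares", "business"),
]

docs = [preprocess(text) for text, _ in TRAIN]
vocab = sorted({token for doc in docs for token in doc})
X = [featurize(doc, vocab) for doc in docs]
y = [1 if label == "sports" else -1 for _, label in TRAIN]
w, b = train_linear_svm(X, y)

def classify(text):
    """Run all three stages on a new headline and return its category."""
    x = featurize(preprocess(text), vocab)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "sports" if score >= 0 else "business"

print(classify("Team scores late goal"))
print(classify("Market shares rise"))
```

Words absent from the training vocabulary (such as "late" or "rise") are simply ignored by the stand-in feature step. Handling more than two categories would require an extension such as one-vs-rest training, which the abstract does not detail.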