An Improved Algorithm of Bayesian Text Categorization
Source: Academy Publisher
Text categorization is a fundamental technique in text mining and has been an active research topic in data mining and web mining in recent years. It plays an important role in traditional information retrieval, web indexing architectures, and Web information retrieval. This paper presents an improved text categorization algorithm that combines a feature weighting technique with the Naive Bayesian classifier. Experimental results show that weighting features with the improved Gini index algorithm effectively improves the performance of the Naive Bayesian classifier. The algorithm has also been applied with good results in a sensitive information recognition system.
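To make the idea of combining feature weighting with a Naive Bayesian classifier concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not state the weighting formula, so the sketch assumes one commonly used Gini-style term weight, Gini(w) = sum over classes c of P(w|c)^2 * P(c|w)^2, estimated from document frequencies, and uses it to scale each term's log-likelihood in a multinomial Naive Bayes score. The function names `train` and `classify` and the toy documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def train(docs, labels):
    """Fit priors, smoothed likelihoods, and Gini-style term weights.

    docs: list of token lists; labels: list of class names (same length).
    """
    classes = sorted(set(labels))
    vocab = sorted({t for d in docs for t in d})
    class_docs = Counter(labels)                    # documents per class
    term_counts = {c: Counter() for c in classes}   # term occurrences per class
    doc_freq = {c: Counter() for c in classes}      # documents containing a term, per class
    for tokens, c in zip(docs, labels):
        term_counts[c].update(tokens)
        doc_freq[c].update(set(tokens))

    prior = {c: class_docs[c] / len(docs) for c in classes}

    # Laplace-smoothed multinomial likelihoods P(w | c).
    likelihood = {}
    for c in classes:
        total = sum(term_counts[c].values())
        likelihood[c] = {
            w: (term_counts[c][w] + 1) / (total + len(vocab)) for w in vocab
        }

    # ASSUMPTION: Gini-style weight sum_c P(w|c)^2 * P(c|w)^2, with both
    # probabilities estimated from document frequencies. The source abstract
    # does not give the exact formula.
    weight = {}
    for w in vocab:
        df_w = sum(doc_freq[c][w] for c in classes)
        weight[w] = sum(
            (doc_freq[c][w] / class_docs[c]) ** 2 * (doc_freq[c][w] / df_w) ** 2
            for c in classes
        )
    return prior, likelihood, weight


def classify(tokens, prior, likelihood, weight):
    """Return the class with the highest weighted log-posterior score."""
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for t in tokens:
            if t in weight:  # terms unseen in training are ignored
                score += weight[t] * math.log(likelihood[c][t])
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data, hypothetical and for illustration only.
train_docs = [
    ["secret", "password", "leak"],
    ["password", "account", "secret"],
    ["weather", "sunny", "today"],
    ["football", "match", "today"],
]
train_labels = ["sensitive", "sensitive", "normal", "normal"]

model = train(train_docs, train_labels)
print(classify(["secret", "password"], *model))
```

Under this assumed weighting, a term that appears in every document of one class and in no other class receives the maximum weight of 1, while a term spread evenly across classes receives a smaller weight and therefore contributes less to the classification score.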