An Efficient Feature Reduction Technique For Large Scale Text Data

Date Added: Mar 2012
Format: PDF

Dimension reduction of text data plays an important role in present-day text mining and information retrieval. Dimensionality reduction algorithms are generally designed for either feature extraction or feature selection. The authors' feature selection algorithm can achieve the optimal solution according to its objective function. In this paper, they formulate the two algorithm categories through a unified optimization framework, under which they develop a novel feature selection algorithm called Trace Oriented Feature Analysis (TOFA). The paper considers the following four parameters: coverage factor, phrase frequency, conceptual frequency, and GSO-feature selection. The coverage factor of the features is defined as the percentage of documents containing at least one of the extracted features.
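
The coverage factor definition above can be illustrated with a minimal Python sketch. It assumes each document is already tokenized into a list of terms and that a feature is a single term; the function name `coverage_factor` and the sample documents are hypothetical and do not come from the paper.

```python
def coverage_factor(documents, features):
    """Return the percentage of documents that contain at least one feature.

    Assumption: each document is a list of tokens and each feature is a
    single token. The paper's own tokenization is not described here.
    """
    if not documents:
        return 0.0
    feature_set = set(features)
    # A document is covered when it shares at least one token with the features.
    covered = sum(1 for doc in documents if feature_set & set(doc))
    return 100.0 * covered / len(documents)


# Hypothetical toy corpus of four tokenized documents.
docs = [
    ["text", "mining", "retrieval"],
    ["feature", "selection"],
    ["image", "processing"],
    ["dimension", "reduction", "text"],
]

print(coverage_factor(docs, ["text", "feature"]))  # 75.0
```

Here three of the four documents contain "text" or "feature", so the coverage factor is 75 percent.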