Categorical Proportional Difference: A Feature Selection Method for Text Categorization

Supervised text categorization is a machine learning task in which a predefined category label is automatically assigned to a previously unlabeled document based on characteristics of the words it contains. Because the number of unique words in a learning task (i.e., the number of features) can be very large, feature selection methods can improve both the efficiency and the accuracy of the task by retaining only the subset of features considered most relevant.
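To make the idea of feature selection concrete, here is a minimal Python sketch. The abstract does not state the scoring formula, so the sketch assumes one common formulation of categorical proportional difference: for a word w and a category c, CPD(w, c) = (A - B) / (A + B), where A is the number of documents in c that contain w and B is the number of documents outside c that contain w, with the word's score taken as the maximum over categories. The function names, the threshold parameter, and the toy corpus are hypothetical and used for illustration only.

```python
from collections import Counter, defaultdict


def cpd_scores(docs):
    """Score each word by its maximum CPD over all categories.

    docs: list of (tokens, label) pairs.
    Assumed formula: CPD(w, c) = (A - B) / (A + B), where A counts
    documents in c containing w and B counts documents outside c
    containing w.
    """
    # word -> Counter mapping category -> number of documents containing the word
    doc_freq = defaultdict(Counter)
    for tokens, label in docs:
        for word in set(tokens):  # count each word once per document
            doc_freq[word][label] += 1

    scores = {}
    for word, per_category in doc_freq.items():
        total = sum(per_category.values())  # A + B, identical for every category
        # For category c, A = per_category[c] and B = total - A,
        # so (A - B) / (A + B) = (2A - total) / total.
        scores[word] = max((2 * a - total) / total for a in per_category.values())
    return scores


def select_features(docs, threshold):
    """Keep the words whose score meets a (hypothetical) threshold."""
    return {word for word, score in cpd_scores(docs).items() if score >= threshold}


# Toy corpus, invented for illustration only.
docs = [
    (["ball", "goal", "the"], "sport"),
    (["goal", "the"], "sport"),
    (["vote", "the"], "politics"),
    (["vote", "ball", "the"], "politics"),
]
selected = select_features(docs, threshold=1.0)
```

In this toy corpus, "goal" and "vote" each occur in only one category and so receive the highest possible score, while "the" and "ball" are spread evenly across both categories and score zero. A word spread evenly across categories says little about which label applies, which is the intuition behind discarding it.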

Provided by: Australian Computer Society
Topic: Data Management
Date Added: Sep 2008
Format: PDF
