Effective Feature Selection For High Dimensional Data using Fast Algorithm

Provided by: International Journal of Advanced Research in Computer Science & Technology (IJARCST)
Topic: Data Management
Format: PDF
Feature subset clustering is a powerful technique for reducing the dimensionality of feature vectors in text classification. In this paper, the authors propose a similarity-based, self-constructing algorithm for feature clustering that uses a K-Means strategy. The words in the feature vector of a document set are grouped into clusters based on a similarity test: words that are similar to each other are placed in the same cluster, and a head is designated for each cluster. With the FAST algorithm, the derived membership functions closely match and properly describe the real distribution of the training data.
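
As a rough illustration of the idea of grouping similar words and naming a head for each group, here is a minimal sketch. It is not the authors' self-constructing algorithm or FAST itself: it substitutes plain k-means with cosine similarity, and it assumes each word is represented by its occurrence counts across a small set of documents. The function names (`cosine`, `cluster_words`), the toy vectors, and the rule for picking a head (the member closest to the centroid) are all hypothetical choices made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_words(word_vectors, k, iterations=10):
    """Group words into at most k clusters with a plain k-means loop.

    Returns a dict mapping each cluster's head word to its member words.
    Assumption: the head is the member most similar to the cluster centroid.
    """
    words = sorted(word_vectors)
    # Deterministic seeding (an assumption): the first k words alphabetically.
    centroids = [list(word_vectors[w]) for w in words[:k]]
    assignment = {}
    for _ in range(iterations):
        # Assign every word to its most similar centroid.
        assignment = {
            w: max(range(k), key=lambda c: cosine(word_vectors[w], centroids[c]))
            for w in words
        }
        # Recompute each centroid as the mean of its members' vectors.
        for c in range(k):
            members = [word_vectors[w] for w in words if assignment[w] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    clusters = {}
    for c in range(k):
        members = [w for w in words if assignment[w] == c]
        if members:
            head = max(members, key=lambda w: cosine(word_vectors[w], centroids[c]))
            clusters[head] = members
    return clusters


# Toy data (hypothetical): counts of each word in four documents,
# two about sports followed by two about finance.
vectors = {
    "ball": [3, 2, 0, 0],
    "goal": [2, 3, 0, 0],
    "bank": [0, 0, 3, 2],
    "stock": [0, 0, 2, 3],
}
clusters = cluster_words(vectors, k=2)
```

Each cluster can then stand in for its member words as a single feature, which is how feature clustering reduces the dimensionality of the document vectors.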