Improved Cluster Partition in Principal Component Analysis Guided Clustering

Provided by: International Journal of Computer Applications
Topic: Data Management
Format: PDF
Principal Component Analysis (PCA) guided clustering approach is widely used in high dimensional data to improve the efficiency of K- means cluster solutions. Typically, Pearson correlation is used in PCA to provide an eigen-analysis to obtain the associated components that account for most of the variations in the data. However, PCA based Pearson correlation can be sensitive on non-Gaussian distributed data, which involve skewed observations such as outlying values. Thus, applying PCA based Pearson correlation on such data could affect cluster partitions and generate extremely imbalanced clusters in a high dimensional space.

Find By Topic