An Analysis of Projection Based Multiplicative Data Perturbation for K-Means Clustering
Random projection is a simple yet powerful technique for dimensionality reduction. In this method the data are projected onto a random subspace in a way that approximately preserves the Euclidean distances between all pairs of points. It can be proved that the inner product and the Euclidean distance are preserved in expectation in the projected data. As a result, many important data mining algorithms (e.g., k-means clustering and KNN classification) can be applied efficiently to the transformed data and still produce the expected results.
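The distance-preservation property can be illustrated with a minimal sketch. It assumes one common construction, a projection matrix with i.i.d. standard normal entries scaled by 1/sqrt(k), which may differ from the exact perturbation scheme analyzed in the paper. The function names (random_projection, pairwise_sq_dists), the synthetic data, and the dimensions are hypothetical choices made only for illustration.

```python
import numpy as np

def random_projection(X, k, rng):
    """Project the rows of X (n x d) onto k dimensions.

    Assumption: R has i.i.d. N(0, 1) entries and is scaled by 1/sqrt(k),
    so that inner products are preserved in expectation.
    """
    d = X.shape[1]
    R = rng.standard_normal((d, k))
    return X @ R / np.sqrt(k)

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all pairs of rows of A."""
    sq = np.sum(A * A, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (A @ A.T)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 1000))   # synthetic data: 20 points in 1000 dimensions
Y = random_projection(X, 500, rng)    # the same points in 500 dimensions

# Compare each pairwise squared distance before and after projection.
iu = np.triu_indices(X.shape[0], k=1)
ratio = pairwise_sq_dists(Y)[iu] / pairwise_sq_dists(X)[iu]
max_rel_error = float(np.max(np.abs(ratio - 1.0)))
print(f"largest relative distortion of a squared distance: {max_rel_error:.3f}")
```

Under this construction the distortion tends to shrink as k grows, which is why a distance-based algorithm such as k-means can be expected to behave similarly on Y and on X.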