International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
Clustering data sets that contain a large number of variables presents challenges such as cluster visualization and interpretation and computational complexity. Moreover, as the data become more scattered, the distance measures used to assign data points to clusters become less meaningful, leading to inaccurate clustering. In theory, therefore, it makes sense to reduce the dimensionality of a high-dimensional data set before applying clustering techniques to it. Another challenge in data clustering is the validation of clustering results, for which a number of validity indices have been proposed.
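The loss of meaning in distance measures can be illustrated with a small numerical sketch. The example below is not taken from the study itself; it assumes uniformly random points in the unit hypercube and the Euclidean distance, and the function name `relative_contrast` is a hypothetical helper introduced only for illustration. It computes the relative gap between the farthest and nearest point from a random query point, a quantity that tends to shrink as the number of variables grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points in the unit hypercube.

    Assumption: uniform random data; real data sets may behave differently.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Print the contrast for an increasing number of variables.
    for dim in (2, 10, 100, 500):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions, a small contrast in high dimensions means that the nearest and farthest points are almost equally distant, so a distance-based cluster assignment carries little information. This is one way to motivate dimensionality reduction before clustering.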