Partition Algorithms - A Study and Emergence of Mining Projected Clusters in High-Dimensional Dataset

Date Added: Jul 2011
Format: PDF

High-dimensional data poses a major challenge because of the inherent sparsity of the points. Existing clustering algorithms become inefficient when the required similarity measure is computed between data points in the full-dimensional space. In this paper, a number of projected clustering algorithms are analyzed. However, most of them encounter difficulties when clusters hide in subspaces of very low dimensionality. These challenges motivate the proposal of a reliable K-medoids partitional, distance-based projected clustering algorithm. The proposed process is based on the K-Means and K-Medoids algorithms, with the computation of distance restricted to subsets of attributes where object values are dense.
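
The idea of restricting the distance computation to a subset of attributes can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the function names (projected_distance, assign_to_medoids), the toy data, the hand-picked subspaces, and the normalization by subspace size are all assumptions made for illustration. The sketch shows only the assignment step of a K-medoids-style method in which each medoid carries its own set of relevant dimensions.

```python
import math


def projected_distance(x, y, dims):
    """Euclidean distance computed only over the attribute indices in dims."""
    return math.sqrt(sum((x[d] - y[d]) ** 2 for d in dims))


def assign_to_medoids(points, medoids, medoid_dims):
    """Assign each point to the medoid nearest in that medoid's own subspace.

    Assumption: the distance is divided by the number of dimensions in the
    subspace so that medoids with subspaces of different sizes stay comparable.
    """
    labels = []
    for p in points:
        scores = [
            projected_distance(p, m, dims) / len(dims)
            for m, dims in zip(medoids, medoid_dims)
        ]
        labels.append(scores.index(min(scores)))
    return labels


# Toy data (hypothetical): cluster 0 is dense in attributes 0 and 1,
# cluster 1 is dense only in attribute 2.
points = [
    (1.0, 1.1, 9.0),
    (1.2, 0.9, 2.0),
    (8.0, 5.0, 4.1),
    (3.0, 7.0, 4.0),
]
medoids = [(1.0, 1.0, 5.0), (5.0, 5.0, 4.0)]
medoid_dims = [(0, 1), (2,)]  # relevant attributes per medoid, chosen by hand

labels = assign_to_medoids(points, medoids, medoid_dims)
print(labels)
```

In this toy setup the first two points are far apart in attribute 2, but that attribute is ignored for the first medoid, so both are still grouped with it. A full projected clustering method would also need to discover the dense attributes and update the medoids, which this sketch leaves out.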