An Efficient Technique for Clustering High Dimensional Data Set

In the modern world, advanced technologies produce huge amounts of data with many objects and dimensions. Traditional clustering algorithms do not perform well on high-dimensional data sets because similarity measures lose their meaning: in high dimensions, data objects become nearly equidistant from one another. Some traditional algorithms also converge to locally optimal results because they start from randomly chosen initial cluster centers. Selecting relevant features and choosing good initial cluster centers are therefore two major issues for many partitioning clustering algorithms. In this paper, the authors propose a technique for selecting the most relevant dimensions of a data set and for efficiently initializing cluster centers.
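
The claim that objects become nearly equidistant in high dimensions can be illustrated with a small experiment. The sketch below is not the authors' technique. It is a minimal illustration that assumes points drawn uniformly from the unit hypercube and Euclidean distance, and the function name `relative_contrast` is hypothetical. It measures how much the farthest neighbour of a query point differs from the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points random points in [0, 1]^dim.

    Assumption: uniform random data; real data sets may behave differently.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast shrinks as the dimension grows, so the nearest and
    # farthest points become harder to tell apart by distance alone.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the contrast is large in two dimensions and small in a thousand, which is why a distance-based clustering criterion loses discriminating power unless irrelevant dimensions are removed first.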

Provided by: Universidad Nacional de General Sarmiento
Topic: Data Management
Date Added: Jun 2011
Format: PDF
