Analysis of Different Similarity Measure Functions and Their Impacts on Shared Nearest Neighbor Clustering Approach
Clustering is a technique for grouping data items with similar content. In recent years, density-based clustering algorithms, especially the Shared Nearest Neighbor (SNN) clustering approach, have gained considerable popularity in the field of data mining. SNN finds clusters of different sizes, densities, and shapes, even in the presence of large amounts of noise and outliers. It is widely used where large, multidimensional, and dynamic databases are maintained. A typical clustering technique uses a similarity function to compare data items. Previously, many similarity functions, such as the Euclidean and Jaccard measures, have been studied for this purpose.
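To make the measures named above concrete, the following is a minimal Python sketch, not the implementation evaluated in this work. It shows the Euclidean distance, the Jaccard similarity, and a simplified shared-nearest-neighbor similarity that counts the neighbors two points have in common. The function names, the parameter `k`, and the brute-force neighbor search are illustrative assumptions.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two numeric vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def k_nearest(points, i, k, dist=euclidean_distance):
    """Indices of the k points closest to points[i] (brute force, excludes i)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return set(others[:k])


def snn_similarity(points, i, j, k, dist=euclidean_distance):
    """Simplified SNN similarity: number of shared k-nearest neighbors.

    Assumption: this sketch omits the common extra condition that i and j
    must appear in each other's neighbor lists.
    """
    return len(k_nearest(points, i, k, dist) & k_nearest(points, j, k, dist))
```

Because `snn_similarity` takes the base measure as the `dist` argument, a different distance function can be substituted to observe how the choice of measure changes the neighbor lists and, in turn, the shared-neighbor counts.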