Enhanced Hierarchical Clustering for Genome Databases

Executive Summary

Clustering techniques find interesting, previously unknown patterns in large-scale data embedded in a high-dimensional space. They are applied to a wide variety of problems, including customer segmentation, biology, data mining, machine learning, and geographical information systems. To be practical, clustering algorithms need to scale efficiently with both the dimensionality of the data sets and the size of the database. Hierarchical clustering methods in particular are widely used to find patterns in multidimensional data. In this paper, the authors design an enhanced hierarchical clustering algorithm that scans the dataset and calculates the distance matrix only once. Their main contribution is a reduction in running time, even when a large database is analyzed.
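
The summary does not spell out the authors' algorithm, so the following is only a minimal sketch of the general idea it mentions: an agglomerative clustering that reads the raw data once, builds the distance matrix once, and afterwards updates that matrix in place instead of recomputing it. The choice of single linkage, Euclidean distance, and the names `pairwise_distances` and `single_linkage` are assumptions made for illustration, not details taken from the paper.

```python
import math


def pairwise_distances(points):
    """Build the full Euclidean distance matrix in one pass over the data."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d
    return dist


def single_linkage(points, num_clusters):
    """Merge the closest clusters until num_clusters remain.

    Illustrative only: the raw points are touched once, in
    pairwise_distances; every later step works on the matrix.
    """
    dist = pairwise_distances(points)
    # Each cluster is keyed by the index of one of its members.
    clusters = {i: [i] for i in range(len(points))}

    while len(clusters) > num_clusters:
        ids = sorted(clusters)
        # Find the pair of active clusters with the smallest distance.
        a, b = min(
            ((i, j) for pos, i in enumerate(ids) for j in ids[pos + 1:]),
            key=lambda pair: dist[pair[0]][pair[1]],
        )
        # Single-linkage update: the distance from the merged cluster
        # to any other cluster is the smaller of the two old distances.
        for k in ids:
            if k != a and k != b:
                d = min(dist[a][k], dist[b][k])
                dist[a][k] = dist[k][a] = d
        clusters[a].extend(clusters.pop(b))

    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
    print(single_linkage(sample, 3))
```

The search for the closest pair here is still quadratic per merge, so this sketch shows only the "compute the matrix once" part and makes no claim about the running-time improvement reported in the paper.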

  • Format: PDF
  • Size: 479.77 KB