Automation of Data Clusters Based on Layered HMM
A major problem in cluster analysis is determining the number of clusters in unlabeled data, which is a basic input for most clustering algorithms. Clustering is the process of dividing the items of a given collection of unlabeled patterns (a dataset) into groups (clusters) based on some measure of similarity. A variety of clustering techniques have been proposed in the machine learning, pattern recognition, data mining, and statistics domains. Many of these algorithms require the number of clusters 'C' as an input parameter, so the quality of the resulting clusters depends largely on how well 'C' is estimated.