An Iterative Improvement Procedure for Hierarchical Clustering
The authors describe a procedure which finds a hierarchical clustering by hill-climbing. The cost function they use is a hierarchical extension of the k-means cost; the local moves are tree restructurings and node re-orderings. They show these can be accomplished efficiently, by exploiting special properties of squared Euclidean distances and by using techniques from scheduling algorithms. A hierarchical clustering of n data points is a recursive partitioning of the data into 2, 3,4,? and finally n clusters. Each intermediate clustering is made more fine-grained by splitting one of its clusters.