International Journal for Development of Computer Science & Technology (IJDCST)
Measuring the similarity or distance between two entities is a key step in many data mining and knowledge discovery tasks. The notion of similarity is relatively well understood for continuous data, but for categorical data its computation is not straightforward. Several data-driven similarity measures have been proposed; however, existing text mining algorithms measure the similarity between objects from a single viewpoint. Their drawback is that the resulting clusters cannot exhibit the complete set of relationships among objects. To overcome this drawback, the authors propose a new similarity measure, the hierarchical multi-viewpoint based similarity measure, so that the clusters show all relationships among objects.
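
To make the contrast between single-viewpoint and multi-viewpoint similarity concrete, the following is a minimal sketch. It is not the hierarchical measure proposed here: it assumes one simple multi-viewpoint formulation, in which the similarity of two objects is the average dot product of their difference vectors taken from several other objects (the viewpoints). The function names and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Single-viewpoint similarity: the angle is measured from the origin only."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def multi_viewpoint_similarity(u, v, viewpoints):
    """Assumed formulation: average of (u - h) . (v - h) over all viewpoints h."""
    total = 0.0
    for h in viewpoints:
        du = [a - b for a, b in zip(u, h)]  # u as seen from viewpoint h
        dv = [a - b for a, b in zip(v, h)]  # v as seen from viewpoint h
        total += dot(du, dv)
    return total / len(viewpoints)


# Toy example: two orthogonal objects and two other objects used as viewpoints.
d1 = [1.0, 0.0]
d2 = [0.0, 1.0]
others = [[2.0, 2.0], [0.0, 0.0]]

single = cosine_similarity(d1, d2)
multi = multi_viewpoint_similarity(d1, d2, others)
print(single, multi)
```

In this toy case the two objects look unrelated from the origin, while the viewpoint at (2, 2) sees them as lying in a similar direction, so the averaged score is positive. This is the kind of relationship a single viewpoint can miss.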