Application of a Similarity Measure for Graphs to Web-Based Document Structures
Source: Technische Universität Darmstadt
Because the World Wide Web (WWW) provides a tremendous amount of information, methods for mining the structure of web-based documents are of considerable interest. In this paper, the authors present a similarity measure for graphs that represent web-based hypertext structures. The measure is based mainly on a novel representation of a graph as linear integer strings whose components represent structural properties of the graph. The similarity of two graphs is then defined by the optimal alignment of the underlying property strings.
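The idea of aligning property strings can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring scheme (a cost of |x - y| for aligning two components, a fixed gap cost), the mapping from cost to a similarity value, and the names `alignment_cost` and `similarity` are all assumptions made for illustration. The integer strings stand in for any structural property of a graph, for example a sequence of node out-degrees.

```python
from typing import Sequence


def alignment_cost(a: Sequence[int], b: Sequence[int], gap: int = 1) -> int:
    """Minimum cost of a global alignment of two integer strings.

    Assumed scoring (not taken from the paper): aligning component x
    with component y costs |x - y|; aligning a component with a gap
    costs `gap`.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),  # align both
                cost[i - 1][j] + gap,  # gap in b
                cost[i][j - 1] + gap,  # gap in a
            )
    return cost[n][m]


def similarity(a: Sequence[int], b: Sequence[int], gap: int = 1) -> float:
    """Map the alignment cost into (0, 1]; 1.0 means identical strings."""
    return 1.0 / (1.0 + alignment_cost(a, b, gap))


# Hypothetical property strings for two small hypertext graphs.
g1 = [2, 3, 1]
g2 = [2, 1]
print(similarity(g1, g2))
```

In this sketch, two graphs with identical property strings score 1.0, and the score decreases as more gaps or larger component differences are needed to align the strings. A different scoring function or normalization could be substituted without changing the dynamic-programming structure.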