Integrating XML Data Sources Using Approximate Joins

Source: Association for Computing Machinery

XML is widely recognized as the data interchange standard for tomorrow because of its ability to represent data from a wide variety of sources. Hence, XML is likely to be the format through which data from multiple sources is integrated. This paper studies the problem of integrating XML data sources through correlations realized as join operations. A challenging aspect of this operation is the XML document structure. Two documents might convey approximately or exactly the same information but may be quite different in structure. Consequently, approximate match in structure, in addition to content, has to be folded into the join operation. This paper quantifies approximate match in structure and content for pairs of XML documents using well-defined notions of distance.
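
To make the idea of a distance-based approximate join concrete, here is a minimal Python sketch. It is not the paper's distance measure: as a stand-in assumption it uses the Jaccard distance over sets of (root-to-node tag path, text) pairs, and the names `path_set`, `distance`, and `approximate_join` are hypothetical, chosen only for illustration. Two documents are joined when their distance is at or below a chosen threshold.

```python
import xml.etree.ElementTree as ET


def path_set(xml_text):
    """Return the set of (tag path, stripped text) pairs for every element."""
    root = ET.fromstring(xml_text)
    items = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        items.add((path, (node.text or "").strip()))
        for child in node:
            walk(child, path)

    walk(root, "")
    return items


def distance(doc_a, doc_b):
    """Jaccard distance (an assumed stand-in): 0.0 = identical sets, 1.0 = disjoint."""
    pa, pb = path_set(doc_a), path_set(doc_b)
    union = pa | pb
    if not union:
        return 0.0
    return 1.0 - len(pa & pb) / len(union)


def approximate_join(left, right, threshold):
    """Return index pairs (i, j) whose documents are within the threshold."""
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if distance(a, b) <= threshold
    ]


# Same content, slightly different structure (author vs. writer).
doc_a = "<book><title>XML Joins</title><author>Lee</author></book>"
doc_b = "<book><title>XML Joins</title><writer>Lee</writer></book>"
doc_c = "<car><make>Saab</make></car>"

pairs = approximate_join([doc_a], [doc_b, doc_c], threshold=0.5)
```

With these sample documents, `doc_a` and `doc_b` share two of four distinct path/text pairs, so they join at a threshold of 0.5, while `doc_c` shares nothing with `doc_a` and is excluded. A path-set measure like this is cheap but coarse; a tree-based distance would capture structural differences more faithfully.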
Format: PDF
Size: 943.00
Date: May 2008