A Survey on Tree Edit Distance Lower Bound Estimation Techniques for Similarity Join on XML Data

Provided by: Harbin Institute of Technology
Topic: Big Data
Format: PDF
When integrating tree-structured data from autonomous and heterogeneous sources, exact joins often fail because the same object may be represented differently. Approximate join techniques are therefore often used, in which similar trees are considered to describe the same real-world object. A commonly accepted metric for evaluating tree similarity is the tree edit distance. While it yields good results, this metric is computationally expensive and thus of limited benefit for large databases. To make the join process efficient, many previous works adopt filter-and-refine mechanisms: they provide lower bounds for the tree edit distance in order to avoid unnecessary exact distance calculations.
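
The filter-and-refine idea described above can be illustrated with a minimal Python sketch. It is not taken from the surveyed paper. It assumes unit-cost edit operations (insert, delete, rename) and uses two simple, well-known lower bounds: the difference in node counts, and half the L1 distance between label histograms. The tree representation as `(label, children)` pairs, all function names, and the `exact_ted` callback (a placeholder for whatever exact tree edit distance routine is available) are assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterator, List, Tuple

# Hypothetical tree representation: (label, [child, child, ...]).
Tree = Tuple[str, list]


def labels(tree: Tree) -> Iterator[str]:
    """Yield every node label of the tree in preorder."""
    label, children = tree
    yield label
    for child in children:
        yield from labels(child)


def size_lower_bound(t1: Tree, t2: Tree) -> int:
    """An insert or delete changes the node count by one and a rename by
    zero, so the size difference never exceeds the unit-cost distance."""
    return abs(sum(1 for _ in labels(t1)) - sum(1 for _ in labels(t2)))


def label_histogram_lower_bound(t1: Tree, t2: Tree) -> int:
    """One edit operation changes the L1 distance between the label
    histograms by at most 2, so ceil(L1 / 2) is a lower bound."""
    h1, h2 = Counter(labels(t1)), Counter(labels(t2))
    l1 = sum(abs(h1[k] - h2[k]) for k in h1.keys() | h2.keys())
    return (l1 + 1) // 2


def lower_bound(t1: Tree, t2: Tree) -> int:
    """The maximum of several valid lower bounds is still a lower bound."""
    return max(size_lower_bound(t1, t2), label_histogram_lower_bound(t1, t2))


def similarity_join(
    left: List[Tree],
    right: List[Tree],
    threshold: int,
    exact_ted: Callable[[Tree, Tree], int],
) -> Tuple[List[Tuple[int, int]], int]:
    """Return index pairs whose distance is at most `threshold`, plus the
    number of exact distance computations that were actually performed."""
    matches, refined = [], 0
    for i, t1 in enumerate(left):
        for j, t2 in enumerate(right):
            # Filter step: a lower bound above the threshold rules the pair out.
            if lower_bound(t1, t2) > threshold:
                continue
            # Refine step: only surviving candidates pay for the exact distance.
            refined += 1
            if exact_ted(t1, t2) <= threshold:
                matches.append((i, j))
    return matches, refined
```

Because a lower bound never overestimates the true distance, the filter step cannot discard a pair that belongs in the result; it only reduces how often the expensive exact computation runs. Tighter bounds prune more pairs but typically cost more to evaluate, which is the trade-off that lower bound estimation techniques address.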