On Repairing Structural Problems In Semistructured Data

Semi-structured data such as XML are popular for data interchange and storage. However, many XML documents are improperly nested: their open and close tags do not match. Because some semi-structured formats (e.g., LaTeX) have a flexible grammar, and because many XML documents lack an accompanying DTD or XSD, the authors focus on computing a purely syntactic repair based on edit distance. To solve this problem, they propose a dynamic programming algorithm that runs in cubic time. Although this algorithm does not scale on its own, well-formed substrings of the data can be pruned first to enable faster computation.
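The two ideas in the abstract, a cubic-time dynamic program and the pruning of well-formed substrings, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a simplified cost model in which the only edit is deleting a tag (cost 1 each), and the names parse, prune and repair_cost, as well as the space-separated tag notation, are hypothetical choices made for this illustration.

```python
def parse(text):
    """Turn 'a b /b /a' into (name, is_open) pairs; '/x' denotes a close tag."""
    return [(tok.lstrip("/"), not tok.startswith("/")) for tok in text.split()]


def prune(tokens):
    """Drop well-formed substrings with one stack pass (linear time).

    A close tag that matches the open tag on top of the stack cancels it;
    everything else is kept. What remains is the part that needs repair.
    """
    stack = []
    for name, is_open in tokens:
        if not is_open and stack and stack[-1] == (name, True):
            stack.pop()
        else:
            stack.append((name, is_open))
    return stack


def repair_cost(tokens):
    """Minimum number of tag deletions that leaves a properly nested sequence.

    Interval dynamic program: cost[i][j] covers tokens[i:j]. There are
    O(n^2) intervals and O(n) split points per interval, hence cubic time.
    Assumption: deletion is the only edit operation.
    """
    n = len(tokens)
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Option 1: delete tokens[i].
            best = cost[i + 1][j] + 1
            name, is_open = tokens[i]
            if is_open:
                # Option 2: match tokens[i] with a later close tag tokens[k].
                for k in range(i + 1, j):
                    if tokens[k] == (name, False):
                        best = min(best, cost[i + 1][k] + cost[k + 1][j])
            cost[i][j] = best
    return cost[0][n]
```

For example, repair_cost(parse("a b /a /b")) evaluates the improperly nested sequence directly, while repair_cost(prune(parse("a b /b c /a"))) first shrinks the input to the three tags that are not part of a well-formed substring, so the cubic step runs on a much shorter sequence.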

Provided by: University of Trás-os-Montes and Alto Douro
Topic: Big Data
Date Added: Aug 2013
Format: PDF
