Processing XML Streams With Deterministic Automata

Source: University of Washington

Favorite

Free registration required

This paper considers the problem of evaluating a large number of XPath expressions on an XML stream. The main contribution consists in showing that Deterministic Finite Automata (DFA) can be used effectively for this problem: in the experiments the author achieves a throughput of about 5.4MB/s, independent of the number of XPath expressions (up to 1,000,000 in the tests). The major problem the paper faces is that of the size of the DFA. Since the number of states grows exponentially with the number of XPath expressions, it was previously believed that DFAs cannot be used to process large sets of expressions.
Format:PDF Size:450.10
Date:Jan 2008