Date Added: Mar 2011
This paper studies top-k query evaluation for an important class of probabilistic semi-structured data: nested DAGs (Directed Acyclic Graphs) that describe the possible execution flows (EX-flows) of Business Processes (BPs). The authors consider queries with projection, which select the portions (sub-flows) of the execution flows that interest the user and are most likely to occur at run-time. Retrieving common sub-flows is crucial for applications such as targeted advertisement and BP optimization. Sub-flows are ranked here by the sum of the likelihoods of the EX-flows in which they appear, in contrast to the max-of-likelihood semantics studied in previous work. The authors show that while the sum semantics is more natural, it makes query evaluation much more challenging.
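
The difference between the two ranking semantics can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a small, explicitly enumerated list of execution flows with made-up probabilities, whereas the paper works on a compact nested-DAG representation. The names `FLOWS`, `KEEP`, `project`, and `top_k` are hypothetical, and projection is simplified to keeping only the activities of interest.

```python
from collections import defaultdict

# Hypothetical execution flows with illustrative probabilities (they sum to 1).
FLOWS = [
    (("login", "flight", "pay"), 0.25),
    (("login", "flight", "review", "pay"), 0.25),
    (("login", "hotel", "pay"), 0.40),
    (("login", "flight", "hotel", "pay"), 0.10),
]

# Activities the user is interested in (assumed projection target).
KEEP = {"flight", "hotel"}


def project(flow, keep):
    """Keep only the activities of interest, preserving their order."""
    return tuple(activity for activity in flow if activity in keep)


def top_k(flows, keep, k, combine):
    """Rank projected sub-flows; `combine` is `sum` or `max` over the
    probabilities of the execution flows in which a sub-flow appears."""
    likelihoods = defaultdict(list)
    for flow, probability in flows:
        likelihoods[project(flow, keep)].append(probability)
    ranked = sorted(likelihoods.items(),
                    key=lambda item: combine(item[1]),
                    reverse=True)
    return [sub_flow for sub_flow, _ in ranked[:k]]


print(top_k(FLOWS, KEEP, 1, sum))  # sum semantics
print(top_k(FLOWS, KEEP, 1, max))  # max semantics
```

In this toy data the sub-flow containing only `flight` appears in two flows of probability 0.25 each, so it ranks first under sum semantics, while the single `hotel` flow of probability 0.40 ranks first under max semantics. Explicit enumeration works here only because the example is tiny; aggregating over many flows is where the sum semantics becomes costly.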