Data Management

Lineage Processing Over Correlated Probabilistic Databases

Download Now Date Added: Jun 2010
Format: PDF

In this paper, the authors address the problem of scalably evaluating conjunctive queries over correlated probabilistic databases containing tuple or attribute uncertainties. Like previous work, the authors adopt a two-phase approach where the authors first compute lineages of the output tuples, and then compute the probabilities of the lineage formulas. However unlike previous work, the authors allow for arbitrary and complex correlations to be present in the data, captured via a forest of junction trees. They observe that evaluating even read-once (tree structured) lineages (e.g., those generated by hierarchical conjunctive queries), polynomially computable over tuple independent probabilistic databases, is #P-complete for lightly correlated probabilistic databases like Markov sequences.