Probabilistic Databases with MarkoViews
Most work on query evaluation in probabilistic databases has focused on the simple tuple-independent data model, in which tuples are independent random events. Several efficient query evaluation techniques exist in this setting, such as safe plans, algorithms based on OBDDs, tree decomposition, and a variety of approximation algorithms. However, complex data analytics tasks often require complex correlations, and query evaluation then becomes significantly more expensive or more restrictive. In this paper, the authors propose MVDB as a framework both for representing complex correlations and for efficient query evaluation. An MVDB specifies correlations by defining views, called MarkoViews, over the probabilistic relations and declaring the weights of the views' outputs. An MVDB is a (very large) Markov Logic Network.
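
To make the idea of a weighted view concrete, the following is a minimal sketch, not the authors' implementation. It assumes a multiplicative Markov Logic Network semantics in which each tuple carries a weight equal to its odds p/(1-p), a world's weight is the product of the weights of the tuples and view outputs it contains, and probabilities are obtained by normalizing over all possible worlds. The relations R and S, the single view V(a) :- R(a), S(a), and the function names are all hypothetical, and brute-force enumeration is used only for illustration; it is not one of the efficient techniques listed above.

```python
from itertools import product


def world_weight(world, tuple_weights, view_weight):
    """Weight of one possible world under the assumed multiplicative semantics."""
    w = 1.0
    for name, present in world.items():
        if present:
            w *= tuple_weights[name]  # odds of each tuple that is present
    # Hypothetical view V(a) :- R(a), S(a): its output exists in this world
    # exactly when both base tuples are present, and then contributes its weight.
    if world["R(a)"] and world["S(a)"]:
        w *= view_weight
    return w


def query_probability(tuple_weights, view_weight):
    """P(R(a) AND S(a)) by enumerating every possible world (illustration only)."""
    names = list(tuple_weights)
    total = 0.0
    satisfying = 0.0
    for bits in product([False, True], repeat=len(names)):
        world = dict(zip(names, bits))
        w = world_weight(world, tuple_weights, view_weight)
        total += w
        if world["R(a)"] and world["S(a)"]:
            satisfying += w
    return satisfying / total


# Tuple weights are odds: p = 0.5 gives 1.0, p = 0.2 gives 0.25.
weights = {"R(a)": 1.0, "S(a)": 0.25}
independent = query_probability(weights, view_weight=1.0)
correlated = query_probability(weights, view_weight=4.0)
exclusive = query_probability(weights, view_weight=0.0)
```

Under these assumptions, a view weight of 1 leaves the two tuples independent, a weight above 1 makes them positively correlated, and a weight of 0 makes them mutually exclusive.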