Jigsaw: Efficient Optimization Over Uncertain Enterprise Data
Probabilistic databases, particularly those that let users define external models or probability distributions (so-called VG-Functions), are an ideal tool for constructing, simulating, and analyzing hypothetical business scenarios. Enterprises often use such tools with parameterized models and must explore the parameter space to discover parameter values that optimize for a given goal. Because this space is usually very large, such exploration is extremely expensive. The authors present Jigsaw, a simulation framework built on a probabilistic database that addresses this performance problem. In Jigsaw, users define what-if scenarios as parameterized probabilistic database queries and identify parameter values that achieve desired properties. Jigsaw uses a novel "Fingerprinting" technique that efficiently identifies correlations between a query's output distributions for different parameter values.
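
The abstract does not say how fingerprints are computed or compared, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes that a fingerprint is the list of outputs a stochastic model produces under a small, fixed set of random seeds, and that two parameter values count as "correlated" when their fingerprints are related by an affine map. The model `demand_model`, the seed list, and every function name here are hypothetical.

```python
import random

# Assumption: a fixed set of seeds shared by every parameter value.
FINGERPRINT_SEEDS = [11, 23, 37, 41, 59]

def demand_model(price, rng):
    """Hypothetical VG-function: one random demand sample for a given price."""
    noise = rng.gauss(0.0, 1.0)
    return (100.0 - 5.0 * price) + (10.0 + price) * noise

def fingerprint(model, param, seeds=FINGERPRINT_SEEDS):
    """Evaluate the model once per fixed seed; the outputs form the fingerprint."""
    return [model(param, random.Random(s)) for s in seeds]

def affine_match(fp_a, fp_b, tol=1e-6):
    """Return (scale, shift) such that fp_b ~= scale * fp_a + shift, else None."""
    dx = fp_a[1] - fp_a[0]
    if abs(dx) < 1e-12:
        return None
    # Solve for the map from the first two entries, then verify all entries.
    scale = (fp_b[1] - fp_b[0]) / dx
    shift = fp_b[0] - scale * fp_a[0]
    ok = all(abs(scale * x + shift - y) <= tol * max(1.0, abs(y))
             for x, y in zip(fp_a, fp_b))
    return (scale, shift) if ok else None

def full_simulation_mean(model, param, n=5000, seed=0):
    """The expensive step: a Monte Carlo estimate of the mean output."""
    rng = random.Random(seed)
    return sum(model(param, rng) for _ in range(n)) / n

def explore(model, params):
    """Estimate the mean for each parameter, reusing earlier runs when possible."""
    bases = []      # (fingerprint, mean) pairs from full simulations
    means = {}
    full_runs = 0
    for p in params:
        fp = fingerprint(model, p)
        for base_fp, base_mean in bases:
            match = affine_match(base_fp, fp)
            if match is not None:
                scale, shift = match
                # The mean of an affine-transformed variable is the
                # affine transform of the mean, so no new simulation is needed.
                means[p] = scale * base_mean + shift
                break
        else:
            means[p] = full_simulation_mean(model, p)
            bases.append((fp, means[p]))
            full_runs += 1
    return means, full_runs

means, full_runs = explore(demand_model, [1, 2, 3, 4, 5])
print(full_runs, means)
```

Because the hypothetical model's output is an affine function of a single noise term, every price after the first matches the first fingerprint, and only one full simulation is run. A real workload would need a richer class of mappings and a policy for deciding when a match is close enough to trust.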