DataSynth: Generating Synthetic Data using Declarative Constraints

A variety of scenarios, such as database system and application testing, data masking, and benchmarking, require synthetic database instances, often with complex data characteristics. The authors present DataSynth, a flexible tool for generating synthetic databases. DataSynth uses a simple yet powerful declarative abstraction based on cardinality constraints to specify data characteristics, and uses sophisticated algorithms to efficiently generate database instances that satisfy the specified characteristics. The paper showcases the various features of DataSynth using two real-world data generation scenarios.
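To make the idea of a cardinality constraint concrete, the following is a minimal Python sketch, not DataSynth's actual interface or algorithm. It assumes a constraint can be modeled as "exactly N rows satisfy a predicate". The names `CardinalityConstraint`, `check`, and `generate_people`, as well as the `age` and `city` columns, are hypothetical and chosen only for illustration. The toy generator handles three nested constraints by construction; the general problem is what calls for the more sophisticated algorithms the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class CardinalityConstraint:
    """Hypothetical model: exactly `count` rows must satisfy `predicate`."""
    name: str
    predicate: Callable[[Row], bool]
    count: int


def check(rows: List[Row], constraints: List[CardinalityConstraint]) -> Dict[str, bool]:
    """Report, per constraint, whether the instance has the required row count."""
    return {
        c.name: sum(1 for r in rows if c.predicate(r)) == c.count
        for c in constraints
    }


def generate_people(total: int, n_young: int, n_young_ny: int) -> List[Row]:
    """Toy generator for three nested constraints (illustrative only).

    Produces `total` rows, of which `n_young` have age < 30, and of those
    `n_young_ny` live in 'NY'. Works only because the constraints are nested.
    """
    if not 0 <= n_young_ny <= n_young <= total:
        raise ValueError("constraints are not satisfiable in this toy model")
    rows: List[Row] = []
    for i in range(total):
        if i < n_young_ny:
            rows.append({"id": i, "age": 25, "city": "NY"})
        elif i < n_young:
            rows.append({"id": i, "age": 25, "city": "SF"})
        else:
            rows.append({"id": i, "age": 40, "city": "SF"})
    return rows


# Declarative specification: what the data must look like, not how to build it.
constraints = [
    CardinalityConstraint("all_rows", lambda r: True, 10),
    CardinalityConstraint("young", lambda r: r["age"] < 30, 4),
    CardinalityConstraint(
        "young_in_ny", lambda r: r["age"] < 30 and r["city"] == "NY", 2
    ),
]

people = generate_people(total=10, n_young=4, n_young_ny=2)
report = check(people, constraints)
```

Here `report` maps each constraint name to whether the generated instance satisfies it, which separates the specification (the list of constraints) from the generation strategy.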

Resource Details

Provided by:
VLD Digital
Topic:
Data Management
Format:
PDF