Arnold: Declarative Crowd-Machine Data Integration

Provided by: Creative Commons
Topic: Big Data
Format: PDF
The availability of rich data from sources such as the World Wide Web, social media, and sensor streams is giving rise to a range of applications that rely on a clean, consistent, and integrated database built over these sources. Human input, or crowd-sourcing, is an effective tool to help produce such high-quality data. It is infeasible, however, to involve humans at every step of the data cleaning process for all data. The authors have developed a declarative approach to data cleaning and integration that balances when and where to apply crowd-sourcing and machine computation using a new type of data independence that they term labor independence.