Data Curation at Scale: The Data Tamer System

Provided by: Creative Commons
Topic: Big Data
Format: PDF
Data curation is the act of discovering one or more data sources of interest, cleaning and transforming the new data, semantically integrating it with other local data sources, and deduplicating the resulting composite. There has been much research on the individual components of curation, especially data integration and deduplication. However, there has been little work on combining all of the curation components into an integrated end-to-end system. In addition, most previous work will not scale to the problem sizes that the authors are finding in the field.