A Uniform Dependency Language for Improving Data Quality
A variety of dependency formalisms have been studied for improving data quality. To treat these dependencies in a uniform framework, the authors propose a simple language, Quality Improving Dependencies (QIDs). They show that dependencies previously considered for data quality can be expressed naturally as QIDs, and that different enforcement mechanisms for QIDs yield different data repairing strategies.

Data quality has been a longstanding line of research and has become one of the most pressing challenges in data management. For example, it is estimated that dirty data costs US companies alone 600 billion dollars each year. This creates a need to study techniques for improving data quality.