Deriving Effectiveness Measures for Data Quality Rules

Poor data quality is a major concern worldwide and an obstacle to data integration and analysis efforts. Detecting errors and inconsistencies using application-specific data quality rules plays an important role in data quality assessment. These rules differ in efficacy and cost under different circumstances. In a previous paper, the authors proposed a quantitative framework for measuring and comparing data quality rules in terms of their effectiveness. Effectiveness formulas are built from variables that represent probabilistic assumptions about the occurrence of errors in data values, and that earlier paper gave examples of how to derive these formulas in an ad hoc fashion.

