International Journal of Computer Applications
Data quality is critical to the quality of the patterns and analyses derived from data. One important factor that degrades data quality is the violation of semantic integrity, which leads to inconsistency and, in turn, to misleading patterns or reports when data mining or data warehousing techniques are applied to such data. In this paper, a data quality mining technique is proposed to automatically generate semantic integrity constraint rules from the data. This process also identifies outliers, which are then classified as either violations or genuine exceptions.
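
The abstract does not describe the rule-generation procedure itself, so the following is only a minimal sketch of the general idea, not the proposed technique. It assumes rules of the form "attribute A = a implies attribute B = b", mined by simple support and confidence thresholds. The function names, the thresholds, and the sample records are all hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def mine_rules(rows, lhs, rhs, min_support=3, min_confidence=0.8):
    """Mine candidate rules lhs=a -> rhs=b (illustrative assumption, not the paper's method).

    Returns {a: (b, confidence)} for every lhs value a that occurs at least
    min_support times and whose most frequent rhs value b reaches min_confidence.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[lhs]][row[rhs]] += 1

    rules = {}
    for a, rhs_counts in counts.items():
        total = sum(rhs_counts.values())
        b, hits = rhs_counts.most_common(1)[0]
        confidence = hits / total
        if total >= min_support and confidence >= min_confidence:
            rules[a] = (b, confidence)
    return rules


def find_outliers(rows, lhs, rhs, rules):
    """Return the rows that contradict a mined rule.

    These are only candidates: each one still has to be classified as a
    violation or a genuine exception.
    """
    return [
        row for row in rows
        if row[lhs] in rules and row[rhs] != rules[row[lhs]][0]
    ]


# Hypothetical sample data: four consistent records and one inconsistent one
# for "Pune", plus two "Salem" records that are too few to support a rule.
rows = [
    {"city": "Pune", "state": "Maharashtra"},
    {"city": "Pune", "state": "Maharashtra"},
    {"city": "Pune", "state": "Maharashtra"},
    {"city": "Pune", "state": "Maharashtra"},
    {"city": "Pune", "state": "Gujarat"},
    {"city": "Salem", "state": "Tamil Nadu"},
    {"city": "Salem", "state": "Oregon"},
]

rules = mine_rules(rows, "city", "state")
outliers = find_outliers(rows, "city", "state", rules)
```

In this sketch the "Salem" records yield no rule because they fall below the support threshold, which illustrates why a flagged outlier cannot be treated as a violation automatically: the same value pair may be a data entry error or a legitimate exception.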