Implementation of Data Cleaning/Scrubbing in Data Ware House for Efficient Data Quality
In this paper, the authors discuss optimizing the data held in a data warehouse. A data warehouse holds a large volume of data drawn from different data centers, and as a result it can contain impurities such as duplicate, incomplete, and unclustered data. Because the reliability of data depends on its accuracy, the authors combine three approaches to resolve these impurities: duplicate data detection and elimination, association rules to collect related data, and a fuzzy approach to data classification.
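The first of the three approaches, duplicate detection and elimination, can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the summary does not describe. The function names (`normalize`, `similarity`, `remove_duplicates`), the 0.9 threshold, the sample records, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(record):
    """Lowercase each field and collapse runs of whitespace."""
    return tuple(" ".join(str(field).lower().split()) for field in record)


def similarity(a, b):
    """Average per-field string similarity of two normalized records."""
    scores = [SequenceMatcher(None, x, y).ratio() for x, y in zip(a, b)]
    # Treat records with no comparable fields as dissimilar.
    return sum(scores) / len(scores) if scores else 0.0


def remove_duplicates(records, threshold=0.9):
    """Keep the first record of each group of near-identical records.

    The threshold is a hypothetical value; a real warehouse would tune it.
    """
    kept = []       # original records that survive
    kept_norm = []  # normalized forms of the surviving records
    for record in records:
        norm = normalize(record)
        if any(similarity(norm, seen) >= threshold for seen in kept_norm):
            continue  # near-duplicate of a record already kept
        kept.append(record)
        kept_norm.append(norm)
    return kept


# Hypothetical sample rows, e.g. customers merged from two data centers.
records = [
    ("John Smith", "New York"),
    ("john  smith", "new york"),
    ("Jon Smith", "New York"),
    ("Mary Jones", "Boston"),
]
cleaned = remove_duplicates(records)
print(cleaned)
```

The pairwise comparison is quadratic in the number of records, so on warehouse-scale data a sketch like this would need a blocking or sorting step first to limit which records are compared.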