ETL and Data Quality: Which Comes First?
Source: ThotWave Technologies
One of the primary tasks in any data warehousing project is a detailed examination of the source systems, including an audit of data quality. Data quality issues include inconsistent data representation, missing data, and difficulty understanding the relationships between the various source systems. As ETL and data quality technologies converge, it is important to use the right tool at the right time to take full advantage of the strengths of each. Within SAS Data Integration Server, there are several opportunities to address data quality issues. A course of action is established, with suggested roles for stakeholders outside the direct development team who can provide input on the ETL process. Data quality is a vital factor in developing a data warehouse ETL process, because the results of that process directly affect the quality and value of the data warehouse. Because business knowledge resides at all levels of the organization, involving stakeholders at every step of the process is invaluable. Beyond the technical work of building an ETL process, business requirements dictate important data quality steps that the ETL process must include. The technical workings of ETL, including the seamless use of data quality functions and procedures, combined with attention to business process, make development in SAS a true data integration process.