Managing ETL Processes
ETL tools allow the definition of potentially complex processes that extract, transform, and load heterogeneous data into a data warehouse or perform other data migration tasks. Larger organizations accumulate many ETL processes across different data integration projects. Such processes can share common sub-processes, data sources and targets, and identical or similar operations. However, there is no common method or approach for managing such ETL processes systematically. The paper proposes the high-level management of such processes as a generic approach to enable their flexible re-use, optimization, and rapid development. To this end, the paper introduces a set of basic operators on ETL processes, such as merge or invert, and motivates their use in several scenarios.
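To give an intuition for what operators such as merge and invert might look like, the following is a minimal sketch and not the paper's formal definition. It assumes a deliberately simple model in which a process is an ordered sequence of steps, each moving data from a source to a target through a named operation. The names `Step`, `Process`, `INVERSES`, and all table and operation names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hypothetical ETL step: apply `operation` to move data from source to target."""
    source: str
    operation: str
    target: str


@dataclass(frozen=True)
class Process:
    """A hypothetical ETL process modeled as an ordered tuple of steps."""
    name: str
    steps: tuple


# Assumed lookup of operations that have a known inverse (illustrative only).
INVERSES = {
    "copy": "copy",
    "split_full_name": "join_full_name",
    "join_full_name": "split_full_name",
}


def merge(a: Process, b: Process) -> Process:
    """Combine two processes, keeping each shared step only once."""
    combined = list(a.steps)
    for step in b.steps:
        if step not in combined:
            combined.append(step)
    return Process(f"{a.name}+{b.name}", tuple(combined))


def invert(p: Process) -> Process:
    """Reverse the data flow of a process, if every operation has a known inverse."""
    inverted = []
    for step in reversed(p.steps):
        if step.operation not in INVERSES:
            raise ValueError(f"no known inverse for operation {step.operation!r}")
        inverted.append(Step(step.target, INVERSES[step.operation], step.source))
    return Process(f"invert({p.name})", tuple(inverted))


# Two processes from different projects that share one step.
crm = Process("crm_load", (
    Step("crm.customers", "copy", "staging.customers"),
    Step("staging.customers", "split_full_name", "dw.customers"),
))
shop = Process("shop_load", (
    Step("shop.orders", "copy", "staging.orders"),
    Step("staging.customers", "split_full_name", "dw.customers"),
))

merged = merge(crm, shop)      # the shared step appears only once
back = invert(crm)             # flows from dw.customers back to crm.customers
```

In this toy model, merging removes the duplicated step shared by both projects, and inverting yields a process that moves data from the warehouse back toward the source. Real ETL processes are typically graphs with richer operation semantics, so the operators in the paper may differ substantially from this sketch.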