Optimized Incremental ETL Jobs for Maintaining Data Warehouses
Source: Association for Computing Machinery
ETL jobs integrate data from distributed, heterogeneous sources into a data warehouse. A well-known challenge in this context is developing incremental ETL jobs that maintain warehouse data efficiently when source data is updated. In this paper, the authors present a new transformation-based approach for deriving incremental ETL jobs automatically. To this end, they simplify the underlying update propagation process by computing so-called safe updates instead of true ones. They also identify limitations of previously proposed incremental solutions and address them by employing Magic Sets, which they report leads to dramatic performance gains.
Size: 325.90
Date: Aug 2010
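
The distinction between safe and true updates mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the two-table join view, the function names, and the reading of a "safe" update as a superset of the true update that is harmless to apply under set semantics are all assumptions made for illustration. It covers insertions only and does not show the Magic Sets rewriting.

```python
def join(orders, customers):
    """Join orders (order_id, cust_id) with customers (cust_id, region)."""
    regions_by_customer = {}
    for cust_id, region in customers:
        regions_by_customer.setdefault(cust_id, set()).add(region)
    return {
        (order_id, cust_id, region)
        for (order_id, cust_id) in orders
        for region in regions_by_customer.get(cust_id, ())
    }


def safe_insertions(orders, customers, new_orders, new_customers):
    """Candidate view insertions derived from the source deltas alone.

    Assumption: a "safe" update may contain rows that are already in the
    view, so no lookup against the old view is needed to compute it.
    """
    all_orders = orders | new_orders
    all_customers = customers | new_customers
    return join(new_orders, all_customers) | join(all_orders, new_customers)


def true_insertions(view, candidates):
    """Rows that really change the view; requires comparing with the old view."""
    return candidates - view


# Hypothetical sample data.
orders = {(1, "c1"), (2, "c2")}
customers = {("c1", "EU"), ("c2", "US")}
view = join(orders, customers)

new_orders = {(3, "c1"), (1, "c1")}   # (1, "c1") was already loaded earlier
new_customers = {("c3", "APAC")}

safe = safe_insertions(orders, customers, new_orders, new_customers)
true = true_insertions(view, safe)

# Set union is idempotent, so applying either delta gives the same warehouse state.
view_via_safe = view | safe
view_via_true = view | true
full_recompute = join(orders | new_orders, customers | new_customers)
```

In this sketch the safe delta contains one row that is already in the view, yet applying it produces the same result as applying the true delta or recomputing the view from scratch. Computing the safe delta skips the comparison against the old view, which is the kind of simplification the abstract alludes to.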



