Rule-Based Management of Schema Changes at ETL Sources

Date Added: Oct 2009
Format: PDF

In this paper, the authors address the problem of managing inconsistencies that emerge in Extract-Transform-Load (ETL) processes as a result of evolution operations at their sources. They abstract ETL activities as queries and sequences of views. ETL activities and their sources are uniformly modeled as a graph that is annotated with rules for managing evolution events. Given a change at an element of the graph, the framework detects the parts of the graph affected by this change and highlights how they are tuned to respond to it. The authors then present the system architecture of a tool called Hecataeus, which implements the main concepts of the proposed framework.
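
The idea of a rule-annotated graph can be illustrated with a minimal Python sketch. It is not taken from the paper and does not reflect the Hecataeus implementation. The node names, the two policy labels ("propagate" and "block"), the default policy, and the function affected_nodes are all assumptions made for illustration. The sketch walks the graph from a changed source and records, for each node it reaches, the rule that node is annotated with.

```python
from collections import deque

# Hypothetical policy labels; the abstract only says nodes carry "rules".
PROPAGATE, BLOCK = "propagate", "block"


def affected_nodes(consumers, policies, changed):
    """Return {node: policy} for every node reached by a change at `changed`.

    consumers: dict mapping a node to the nodes that read from it.
    policies:  dict mapping a node to its rule; unlisted nodes are assumed
               to default to PROPAGATE.
    """
    result = {}
    queue = deque(consumers.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in result:
            continue  # already visited through another path
        policy = policies.get(node, PROPAGATE)
        result[node] = policy
        # A blocking node absorbs the event; its consumers are not notified.
        if policy == PROPAGATE:
            queue.extend(consumers.get(node, []))
    return result


# Illustrative ETL graph: a source table, a view over it, and two queries.
consumers = {
    "src.customers": ["view.clean_customers"],
    "view.clean_customers": ["query.load_dw", "query.audit"],
    "query.load_dw": ["dw.dim_customer"],
}
policies = {"query.audit": BLOCK}

impact = affected_nodes(consumers, policies, "src.customers")
for node, policy in impact.items():
    print(f"{node}: {policy}")
```

In this sketch a change at src.customers reaches every downstream node, and each one is reported together with the rule that governs its reaction. Annotating view.clean_customers with the blocking policy instead would stop the traversal at that view.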