Policy-Regulated Management of ETL Evolution

Date Added: Nov 2009
Format: PDF

In this paper, the authors address the problem of predicting the impact of changes to the schema or structure of data warehouse sources. They abstract Extract-Transform-Load (ETL) activities as queries and sequences of views. ETL activities and their sources are uniformly modeled as a graph annotated with policies for managing evolution events. Given a change at an element of the graph, their method detects the parts of the graph affected by the change and highlights how those parts are tuned to respond to it. For many cases of ETL source evolution, they present rules that preserve both the syntactic and the semantic correctness of activities.
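The idea of a policy-annotated graph can be illustrated with a small sketch. This is not the authors' algorithm: the policy names `propagate` and `block`, the function `affected_nodes`, and the node names are all assumptions made for illustration, and the traversal is a plain breadth-first search over a dependency map.

```python
from collections import deque

# Hypothetical policy names; the abstract does not list the actual ones.
PROPAGATE = "propagate"
BLOCK = "block"


def affected_nodes(dependents, policies, changed):
    """Return {node: policy} for every node reached by a change at `changed`.

    dependents: maps a node to the nodes that read from it.
    policies:   maps a node to its policy; unlisted nodes default to PROPAGATE.
    """
    reached = {}
    queue = deque(dependents.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in reached:
            continue
        policy = policies.get(node, PROPAGATE)
        reached[node] = policy
        # A blocking node absorbs the event, so its consumers are not visited.
        if policy == PROPAGATE:
            queue.extend(dependents.get(node, []))
    return reached


# Toy ETL flow (invented): a source attribute feeds a cleaning view,
# which feeds a dimension load and a reporting view.
dependents = {
    "SRC.customer.phone": ["V_clean_customer"],
    "V_clean_customer": ["LOAD_dim_customer", "V_report"],
    "V_report": ["LOAD_report"],
}
policies = {"V_report": BLOCK}

result = affected_nodes(dependents, policies, "SRC.customer.phone")
for node, policy in result.items():
    print(node, "->", policy)
```

In this toy flow, a change to `SRC.customer.phone` reaches the cleaning view, the dimension load, and `V_report`. Because `V_report` is annotated with the blocking policy, `LOAD_report` downstream of it is not marked as affected.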