Efficient Provenance Storage

Executive Summary

As the world becomes increasingly networked and digitized, stored data has more and more frequently been chopped, baked, diced and stewed. Consequently, there is an increasing need to store and manage provenance for each data item in a database, describing exactly where it came from and what manipulations have been applied to it. Storing the complete provenance of each data item can become prohibitively expensive. In this paper, the authors identify important properties of provenance that can be used to considerably reduce the amount of storage required. They identify three techniques for decreasing the storage required for provenance: a family of factorization processes and two methods based on inheritance.
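The summary names the techniques without describing them, so the following is only a rough illustration of the two general ideas, not the paper's actual algorithms. It assumes a hypothetical model in which a provenance record is a tuple of processing-step names; the class names `FactoredStore` and `InheritingStore` are invented for this sketch. The first class stores each distinct provenance record once and lets items point to it (a simple form of factoring out shared provenance); the second lets an item without its own record inherit one from its nearest ancestor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical model (an assumption, not from the paper):
# a provenance record is a tuple of processing-step names.
Provenance = Tuple[str, ...]


@dataclass
class FactoredStore:
    """Stores each distinct provenance record once; items hold an index to it."""
    records: List[Provenance] = field(default_factory=list)
    index: Dict[Provenance, int] = field(default_factory=dict)
    item_to_record: Dict[str, int] = field(default_factory=dict)

    def put(self, item_id: str, prov: Provenance) -> None:
        # Only append the record if this exact provenance has not been seen.
        if prov not in self.index:
            self.index[prov] = len(self.records)
            self.records.append(prov)
        self.item_to_record[item_id] = self.index[prov]

    def get(self, item_id: str) -> Provenance:
        return self.records[self.item_to_record[item_id]]


@dataclass
class InheritingStore:
    """Items without their own provenance inherit it from the nearest ancestor."""
    parent: Dict[str, Optional[str]] = field(default_factory=dict)
    own: Dict[str, Provenance] = field(default_factory=dict)

    def add(self, item_id: str, parent_id: Optional[str] = None,
            prov: Optional[Provenance] = None) -> None:
        self.parent[item_id] = parent_id
        if prov is not None:
            self.own[item_id] = prov

    def get(self, item_id: str) -> Provenance:
        # Walk up the parent chain until a stored record is found.
        node: Optional[str] = item_id
        while node is not None:
            if node in self.own:
                return self.own[node]
            node = self.parent.get(node)
        raise KeyError(item_id)
```

In this sketch, a thousand rows loaded by the same pipeline would share a single stored record in `FactoredStore`, and rows of a table in `InheritingStore` need no record of their own unless their history differs from the table's.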

  • Format: PDF
  • Size: 248.1 KB