Provenance Based Rebuild: Using Data Provenance to Improve Reliability
Traditionally, data preservation and reliability have relied on Error Correcting Codes (ECCs) to ensure data safety. The development of general data provenance tracking systems provides a new opportunity for data reliability. The authors present a method that uses provenance to determine a datum's generating process and inputs, and then uses this information to recompute lost data. This method, called Provenance Based Rebuild (PBR), provides a new, complementary reliability mechanism that integrates with traditional systems to offer a variety of benefits, including fine-grained prioritized rebuild and parallel rebuild. While PBR offers benefits that address weaknesses in current techniques, it also faces a number of challenges, such as data placement and infrastructure provisioning.
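
The core idea of recomputing a lost datum from its recorded generating process and inputs can be illustrated with a minimal sketch. This is not the authors' implementation: the `rebuild` function, the in-memory `store`, and the layout of the `provenance` records are all hypothetical and chosen only for illustration. A real system would also have to handle nondeterministic processes, input placement, and the cost of recomputation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical provenance record: datum name -> (generating function, input names).
Provenance = Dict[str, Tuple[Callable[..., object], List[str]]]


def rebuild(name: str, store: Dict[str, object], provenance: Provenance) -> object:
    """Return the datum `name`, recomputing it from provenance if it is lost."""
    if name in store:
        # The datum survived; no rebuild is needed.
        return store[name]
    if name not in provenance:
        # A lost source datum with no recorded generating process cannot be recomputed.
        raise KeyError(f"{name!r} is lost and has no provenance record")
    func, inputs = provenance[name]
    # Recursively recover any inputs that were lost as well.
    args = [rebuild(i, store, provenance) for i in inputs]
    store[name] = func(*args)
    return store[name]


# Illustrative data: only the raw input survives; both derived data are lost.
store: Dict[str, object] = {"raw": [3, 1, 2]}
provenance: Provenance = {
    "sorted": (sorted, ["raw"]),
    "total": (sum, ["sorted"]),
}
result = rebuild("total", store, provenance)
```

In this sketch each lost datum is rebuilt on demand, which hints at how a rebuild could be prioritized per datum. Data with disjoint inputs could in principle be rebuilt in parallel, although the sketch itself runs sequentially.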