On the Speedup of Single-Disk Failure Recovery in XOR-Coded Storage Systems: Theory and Practice

Executive Summary

Modern storage systems stripe redundant data across multiple disks to provide availability guarantees against disk failures. One form of data redundancy is based on XOR-based erasure codes, which use only XOR operations for encoding and decoding. In addition to tolerating failures, a storage system must also recover from them quickly to avoid data unavailability. The authors consider the problem of speeding up recovery from a single-disk failure for arbitrary XOR-based erasure codes, and they address it from both theoretical and practical perspectives. They propose a replace recovery algorithm, which uses a hill-climbing technique to search for a fast recovery solution, so that the search itself completes within a short time.
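
To make the idea of XOR-only encoding and decoding concrete, the following is a minimal sketch of the simplest case: a single parity block per stripe, as in a RAID-5-style layout. It is an illustration under stated assumptions, not the authors' replace recovery algorithm, and the function names (`xor_blocks`, `encode_stripe`, `recover_block`) are hypothetical. The codes the summary refers to generally keep several parity sets, which is what makes the choice of blocks to read during recovery a search problem.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_stripe(data_blocks):
    """Append one XOR parity block to the data blocks (toy single-parity code)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover_block(stripe, failed_index):
    """Rebuild the block on the failed disk by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three data "disks" holding one 4-byte block each, plus one parity disk.
data = [b"disk", b"fail", b"test"]
stripe = encode_stripe(data)

# Simulate losing disk 1 and rebuild its block from the other three.
lost = 1
rebuilt = recover_block(stripe, lost)
assert rebuilt == data[lost]
```

In this toy layout, recovery must read every surviving block in the stripe. With multiple overlapping parity sets there can be several valid ways to rebuild each lost block, and some of them read fewer blocks in total than others.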

  • Format: PDF
  • Size: 369.29 KB