Efficient Replica Maintenance for Distributed Storage Systems
This paper considers replication strategies for storage systems that aggregate the disks of many nodes spread over the Internet. Maintaining replication in such systems can be prohibitively expensive, since every transient network or host failure could trigger copying a server's worth of data over the Internet to restore replication levels. The paper's analysis yields the following insights for designing an efficient replication algorithm. Durability can be provided separately from availability; durability is less expensive to ensure and is a more useful goal for many wide-area applications. The focus of a durability algorithm must be to create new copies of data objects faster than permanent disk failures destroy them.
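
The last insight amounts to a rate comparison, which the following minimal sketch illustrates. It is not the paper's algorithm: the function names, the parameters, and the simplification that each replica's disk fails independently at a fixed yearly rate are all assumptions made for illustration.

```python
# Illustrative sketch only: all names and the simple rate model are assumptions,
# not taken from the paper.

def copies_lost_per_year(num_replicas: int, disk_failures_per_year: float) -> float:
    """Expected replicas of one object destroyed per year by permanent disk
    failures, assuming each replica's disk fails independently at the same rate."""
    return num_replicas * disk_failures_per_year


def copies_created_per_year(repair_bytes_per_year: float, object_bytes: float) -> float:
    """Replicas of one object that the available repair bandwidth can create per year."""
    return repair_bytes_per_year / object_bytes


def keeps_ahead_of_failures(num_replicas: int,
                            disk_failures_per_year: float,
                            repair_bytes_per_year: float,
                            object_bytes: float) -> bool:
    """True when new copies can be created faster than permanent failures destroy them."""
    created = copies_created_per_year(repair_bytes_per_year, object_bytes)
    lost = copies_lost_per_year(num_replicas, disk_failures_per_year)
    return created > lost


if __name__ == "__main__":
    # Hypothetical numbers: 3 replicas, 0.5 permanent failures per disk per year,
    # a 1 GB object, and two different repair bandwidth budgets.
    print(keeps_ahead_of_failures(3, 0.5, 10e9, 1e9))
    print(keeps_ahead_of_failures(3, 0.5, 1e9, 1e9))
```

Only permanent failures enter the loss rate in this sketch; transient network or host failures do not destroy data, so they are left out of the durability comparison.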