Experiences Building Network-Coding-Based Distributed Storage Systems
Large-scale distributed storage systems are prone to node failures. To provide fault tolerance, data is often encoded so that redundancy is maintained across multiple storage nodes. If a node fails, it can be repaired by downloading data from the surviving nodes and regenerating the lost data on a new node. Network coding has recently been proposed as a way to generate this redundancy. It has been shown that network coding can minimize the amount of data transferred during repair while maintaining the same fault tolerance as conventional erasure coding schemes. The key idea is to have the surviving storage nodes first encode their stored data and then send the encoded data to the new node for regeneration.
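To make the repair idea concrete, the following is a minimal sketch under stated assumptions, not the coding scheme or implementation of the systems described here. It assumes a toy layout of four nodes that each store two chunks of a four-chunk file (A, B, C, D), with parity chunks formed by plain XOR so that any two nodes can rebuild the file. The function names `xor`, `encode`, and `repair_node1` are hypothetical and chosen only for illustration.

```python
def xor(*chunks: bytes) -> bytes:
    """Bytewise XOR of equal-length chunks (addition over GF(2))."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


def encode(a: bytes, b: bytes, c: bytes, d: bytes) -> dict:
    """Assumed toy layout: 4 nodes, 2 chunks each; any 2 nodes recover the file."""
    return {
        1: (a, b),
        2: (c, d),
        3: (xor(a, c), xor(b, d)),
        4: (xor(a, d), xor(b, c, d)),
    }


def repair_node1(nodes: dict):
    """Regenerate node 1 from one chunk sent by each of the three survivors."""
    # Each survivor combines its own stored chunks locally before sending.
    from2 = xor(*nodes[2])   # C + D
    from3 = xor(*nodes[3])   # A + B + C + D
    from4 = nodes[4][1]      # B + C + D (sent as stored in this toy layout)
    sent = [from2, from3, from4]

    # The new node cancels the C + D term to recover the lost chunks.
    b = xor(from4, from2)        # (B + C + D) + (C + D) = B
    a = xor(from3, from2, b)     # (A + B + C + D) + (C + D) + B = A
    return (a, b), len(sent)


if __name__ == "__main__":
    nodes = encode(b"aaaa", b"bbbb", b"cccc", b"dddd")
    repaired, transferred = repair_node1(nodes)
    print(repaired == nodes[1], transferred)
```

In this toy layout, a conventional repair would download all four chunks held by two surviving nodes, reconstruct the file, and then re-create the lost chunks. The sketch above transfers three chunks instead, because two of the survivors encode their stored data before sending it.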