On Failure Detection Algorithms in Overlay Networks

One of the key reasons overlay networks are seen as an excellent platform for large-scale distributed systems is their resilience in the presence of node failures. This resilience relies on accurate and timely detection of those failures. Despite the prevalent use of keep-alive algorithms in overlay networks to detect node failures, their tradeoffs and the circumstances to which each is best suited are not well understood. In this paper, the authors study, via analysis, simulation, and implementation, how the design of various keep-alive approaches affects their performance in terms of node failure detection time, probability of false positives, control overhead, and packet loss rate.

Provided by: UC Regents
Topic: Networking
Date Added: Jan 2011
Format: PDF
