Implementation of Watch Dog Timer for Fault Tolerant Computing on Cluster Server

Executive Summary

Clusters have become a necessity for modern computing and data applications, since many applications need long computation times (days or even months). Parallelization speeds up computation, but the time required by many applications can still be long. The reliability of the cluster therefore becomes a very important issue, and a fault-tolerance mechanism becomes essential. The difficulty of designing a fault-tolerant cluster system grows with the severity of the failures it must handle. The most important requirement is that an algorithm that handles simple failures in a system must also tolerate more severe ones. In this paper, the authors implement the theory of the watchdog timer in a parallel environment to handle failures.

  • Format: PDF
  • Size: 278.4 KB