Date Added: Aug 2011
Recovery from transient failures is one of the prime issues in distributed systems, which require recovery techniques that are both transparent and efficient. A checkpoint is a designated place in a program where normal processing is interrupted to preserve status information, and checkpointing is the process of saving that information. Mobile computing systems often suffer from high rates of failures that are transient and independent in nature. To add reliability and high availability to such distributed systems, checkpoint-based rollback recovery is one of the most widely used techniques, in applications such as scientific computing, databases, telecommunications and mission-critical systems.