Self-Recovery in Server Programs

Executive Summary

Long-running server programs must remain available despite software failures. Nevertheless, server programs do fail, and memory errors are one of the important causes. Software bugs in the server code, such as buffer overflows and integer overflows, are exposed by certain user requests and lead to memory corruption, which often results in crashes. One safe way to recover from these crashes is to checkpoint program state periodically and roll back to the most recent checkpoint when a crash occurs. However, periodic checkpointing of program state can be quite expensive.
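
The checkpoint-and-rollback idea can be illustrated with a minimal Python sketch. This is a toy simulation and not the paper's method: the class `CheckpointedServer`, the `interval` parameter, and the use of a Python exception to stand in for a memory-corruption crash are all assumptions made for illustration.

```python
import copy


class CheckpointedServer:
    """Toy request handler that checkpoints its state every `interval` requests."""

    def __init__(self, interval=3):
        self.interval = interval  # hypothetical checkpoint period, in requests
        self.state = {"handled": 0, "total": 0}
        self.checkpoint = copy.deepcopy(self.state)
        self.since_checkpoint = 0

    def _process(self, value):
        # State is mutated before the failure, so a bad request leaves it corrupted.
        self.state["handled"] += 1
        if value < 0:
            self.state["total"] = None  # simulated corruption
            raise ValueError("simulated memory error")
        self.state["total"] += value

    def handle(self, value):
        try:
            self._process(value)
        except ValueError:
            # Roll back to the most recent checkpoint; work done since then is lost.
            self.state = copy.deepcopy(self.checkpoint)
            self.since_checkpoint = 0
            return False
        self.since_checkpoint += 1
        if self.since_checkpoint >= self.interval:
            # Full copy of the state: this is the periodic cost of checkpointing.
            self.checkpoint = copy.deepcopy(self.state)
            self.since_checkpoint = 0
        return True


server = CheckpointedServer(interval=2)
results = [server.handle(v) for v in [5, 7, 1, -1, 4]]
```

In this run the fourth request fails, and the rollback also discards the effect of the third request, which arrived after the last checkpoint. Checkpointing more often would lose less work but would copy the full state more often, which is the cost trade-off the summary points to.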

  • Format: PDF
  • Size: 292.75 KB