How to Keep Your Head Above Water While Detecting Errors

Executive Summary

Today's distributed systems need runtime error detection to catch errors arising from software bugs, hardware faults, or unexpected operating conditions. A prominent class of error detection techniques is stateful: it keeps track of the state of the application being monitored and checks state-based rules against that state. Large-scale distributed applications generate a high volume of messages that can overwhelm the capacity of a stateful detection system. An existing approach to handling this load is to randomly sample the messages and process only a subset. However, this approach leads to non-determinism in the detection system's view of what state the application is in, which in turn degrades the quality of detection.
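
To make the sampling problem concrete, here is a minimal sketch in Python. It is not the paper's detector: the three-message protocol (OPEN, DATA, CLOSE), the state names, and the functions `detect` and `sample` are all hypothetical, chosen only to show how a stateful rule check can lose track of the application's state once messages are dropped.

```python
import random

# Hypothetical protocol (an assumption for illustration): a session must be
# OPENed before DATA is sent, and CLOSE returns the session to IDLE.
TRANSITIONS = {
    ("IDLE", "OPEN"): "ACTIVE",
    ("ACTIVE", "DATA"): "ACTIVE",
    ("ACTIVE", "CLOSE"): "IDLE",
}


def detect(messages):
    """Track the application state and flag messages no rule allows.

    Returns a list of (index, state, message) tuples, one per violation.
    """
    state = "IDLE"
    alarms = []
    for i, msg in enumerate(messages):
        next_state = TRANSITIONS.get((state, msg))
        if next_state is None:
            # No rule permits this message in the tracked state.
            alarms.append((i, state, msg))
        else:
            state = next_state
    return alarms


def sample(messages, rate, rng):
    """Keep each message independently with probability `rate`."""
    return [m for m in messages if rng.random() < rate]


# A fault-free trace: three complete sessions.
trace = ["OPEN", "DATA", "DATA", "CLOSE"] * 3

# Seeing every message, the detector raises no alarms.
print(detect(trace))

# Dropping only the first OPEN leaves the detector believing the
# application is IDLE, so it flags correct traffic as erroneous.
print(detect(trace[1:]))

# With random sampling, which messages are dropped (and therefore which
# state the detector believes in) varies from run to run.
for seed in range(3):
    view = sample(trace, 0.5, random.Random(seed))
    print(seed, len(view), len(detect(view)))
```

In this sketch the alarms on the second trace are false positives caused purely by the detector's incomplete view, which is the kind of degradation the summary attributes to random sampling.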

  • Format: PDF
  • Size: 321.7 KB