Be Conservative: Enhancing Failure Diagnosis with Proactive Logging
When systems fail in the field, logged error or warning messages are frequently the only evidence available for assessing and diagnosing the underlying cause. Consequently, the efficacy of such logging (how often and how well error causes can be determined from postmortem log messages) is a matter of significant practical importance. However, there is little empirical data on how well existing logging practices work or how they can be improved. The authors describe a comprehensive study characterizing the efficacy of logging practices across five large and widely used software systems. Across 250 randomly sampled reported failures, they first find that more than half of the failures could not be diagnosed well using existing log data.