Triage: Diagnosing Production Run Failures at the User's Site

Source: Association for Computing Machinery

Favorite

Free registration required

Diagnosing production run failures is a challenging yet important task. Most previous work focuses on offsite diagnosis, i.e. development site diagnosis with the programmers present. This is insufficient for production-run failures as: It is difficult to reproduce failures offsite for diagnosis; offsite diagnosis cannot provide timely guidance for recovery or security purposes; it is infeasible to provide a programmer to diagnose every production run failure; and privacy concerns limit the release of information (e.g. Coredumps) to programmers. To address production-run failures, the authors propose a system, called Triage that automatically performs onsite software failure diagnosis at the very moment of failure.
Format:PDF Size:158.53
Date:Oct 2007
People who downloaded this item also downloaded