Date Added: Oct 2011
Distributed applications running inside cloud are prone to performance anomalies due to various reasons such as insufficient resource allocations, unexpected workload increases, or software bugs. However, those applications often consist of multiple interacting components where one component anomaly may cause its dependent components to exhibit anomalous behavior as well. It is challenging to identify the faulty components among numerous distributed application components. In this paper, the authors present a Propagation-aware Anomaly Localization (PAL) system that can pinpoint the source faulty components in distributed applications by extracting anomaly propagation patterns.