Adaptive System Anomaly Prediction for Large-Scale Hosting Infrastructures
Large-scale hosting infrastructures require automatic system anomaly management to achieve continuous system operation. In this paper, the authors present a novel adaptive runtime anomaly prediction system, called ALERT, to make hosting infrastructures more robust. In contrast to traditional anomaly detection schemes, ALERT aims to raise advance anomaly alerts so that anomalies can be prevented just in time. They propose a novel context-aware anomaly prediction scheme to improve prediction accuracy in dynamic hosting infrastructures. They have implemented the ALERT system and deployed it on several production hosting infrastructures, such as the IBM System S stream processing cluster and PlanetLab. The experiments show that ALERT can achieve high prediction accuracy for a range of system anomalies and impose low overhead on the hosting infrastructure.