In this paper, the authors present a new online failure forecasting system that enables predictive failure management for fault-tolerant data stream processing. Unlike previous reactive or proactive approaches, predictive failure management uses failure forecasts to take informed, just-in-time preventive actions on abnormal components only. The authors employ stream-based online learning methods to continuously classify runtime operator state as normal, alert, or failure, based on collected feature streams. They have implemented the online failure forecasting system as part of the IBM System S stream processing system. Their experiments show that the system can achieve good prediction accuracy for a range of stream processing software failures while imposing low overhead on the stream system.
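To make the idea of continuously classifying operator state from a feature stream concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not name the learning algorithm, so an incrementally updated nearest-centroid classifier is swapped in purely for illustration. The class name `OnlineCentroidClassifier`, the two features (CPU utilization and input queue length), and all sample values are hypothetical.

```python
# Illustrative sketch only: a streaming nearest-centroid classifier.
# The algorithm, class name, and feature choices are assumptions,
# not details taken from the paper.

STATES = ("normal", "alert", "failure")


class OnlineCentroidClassifier:
    """Keeps one running-mean centroid per state, updated one sample at a time."""

    def __init__(self):
        self.counts = {}     # state -> number of labeled samples seen
        self.centroids = {}  # state -> running mean of feature vectors

    def update(self, features, label):
        """Fold one labeled feature vector into the centroid for its state."""
        if label not in STATES:
            raise ValueError(f"unknown state: {label!r}")
        n = self.counts.get(label, 0) + 1
        self.counts[label] = n
        if label not in self.centroids:
            self.centroids[label] = [float(x) for x in features]
            return
        centroid = self.centroids[label]
        for i, x in enumerate(features):
            # Incremental mean: new_mean = old_mean + (x - old_mean) / n
            centroid[i] += (x - centroid[i]) / n

    def predict(self, features):
        """Return the state whose centroid is closest in squared Euclidean distance."""
        if not self.centroids:
            return "normal"  # assumed default before any training data arrives

        def sq_dist(state):
            return sum((a - b) ** 2 for a, b in zip(self.centroids[state], features))

        return min(self.centroids, key=sq_dist)


# Hypothetical feature vectors: (cpu_utilization, input_queue_length)
clf = OnlineCentroidClassifier()
for sample, state in [
    ((0.20, 5), "normal"),
    ((0.30, 7), "normal"),
    ((0.70, 40), "alert"),
    ((0.80, 60), "alert"),
    ((0.99, 500), "failure"),
]:
    clf.update(sample, state)

print(clf.predict((0.78, 45)))
```

Because each update touches only one centroid and stores no past samples, the per-sample cost stays constant, which is the kind of property a low-overhead online monitor would need. A real deployment would also have to normalize features so that one large-valued feature, such as queue length here, does not dominate the distance.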
- Format: PDF
- Size: 62.76 KB