Design and Evaluation of an Online Anomaly Detector for Distributed Storage Systems
Performance problems in distributed storage systems are difficult to diagnose and isolate because they may stem from different system components, such as the network, memory, and storage devices. In this paper, the authors present a performance anomaly detector that efficiently detects performance anomalies and accurately identifies their faulty sources within a node of a distributed storage system. Their method exploits the stable relationship between workloads and system resource statistics to detect performance anomalies and to identify the faulty sources that cause them. Their experimental results demonstrate the efficiency and accuracy of the proposed detector.
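
The abstract does not specify how the workload-to-resource relationship is modeled, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each resource statistic depends linearly on the workload, fits that line on data from a healthy period, and flags a resource when a new observation deviates from the fit by more than k standard deviations of the baseline residuals. The resource names (`disk_util`, `net_mbps`), the sample values, and the threshold `k=3.0` are all hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def train(baseline):
    """Fit one workload-to-statistic model per resource.

    baseline maps a resource name to (workloads, stats) collected
    while the node was known to be healthy (an assumption of this sketch).
    """
    models = {}
    for name, (xs, ys) in baseline.items():
        a, b = fit_line(xs, ys)
        residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
        models[name] = (a, b, stdev(residuals))
    return models


def detect(models, workload, observed, k=3.0):
    """Return the resources whose statistic breaks the learned relationship."""
    faulty = []
    for name, (a, b, sigma) in models.items():
        expected = a * workload + b
        if abs(observed[name] - expected) > k * sigma:
            faulty.append(name)
    return faulty


# Hypothetical healthy-period samples: workload in requests per second.
workloads = [100, 200, 300, 400, 500]
baseline = {
    "disk_util": (workloads, [10.5, 19.8, 30.2, 39.9, 50.1]),
    "net_mbps": (workloads, [20.4, 40.1, 59.7, 80.3, 99.9]),
}
models = train(baseline)

healthy = detect(models, 300, {"disk_util": 30.0, "net_mbps": 60.0})
degraded = detect(models, 300, {"disk_util": 75.0, "net_mbps": 60.0})
```

In this sketch, `healthy` is an empty list, while `degraded` names only `disk_util`, because disk utilization is far above what the workload predicts while network throughput still follows its fitted line. Keeping one model per resource is what lets a deviation indicate which component is the likely faulty source rather than only that an anomaly exists.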