Self-Healing Mechanism for Reliable Computing
One of the four major processes of an autonomic computing system is to free people from discovering, recovering, and failures, this process is called self-healing. Systems designed to be self-healing are able to heal themselves at runtime in response to changing environmental or operational circumstances. Thus, the goal is to avoid catastrophic failure through prompt execution of remedial actions. This paper proposes a self-healing mechanism that monitors, diagnoses and heals its own internal problems using self-awareness as contextual information. The self-management system that encapsulates the self-healing mechanism related to reliability improvement addresses: monitoring layer, diagnosis & decision layer, and adaptation layer, in order to perform self-diagnosis and self-healing. To confirm the effectiveness of self-healing mechanism, practical experiments are conducted with a prototype system.