Reliability Aware Exceptions for Software Directed Fault Handling
Reliability has emerged as a first-order design constraint. Faults encountered in a chip can be classified into three categories: transient, intermittent, and permanent. Fault classification allows a chip designer to provide the appropriate corrective action for each fault type. However, fault classification and correction are expensive to implement in hardware. In spite of their criticality, faults are still relatively rare; hence, classification and recovery mechanisms should be very low cost. In this paper, the authors present a new class of exceptions called Reliability Aware Exceptions (RAE).