European Design and Automation Association
Current technology scaling is leading to increasingly fragile components, making hardware reliability a primary design consideration. Recently researchers have proposed low-cost reliability solutions that detect hardware faults through software-level symptom monitoring. SWAT (SoftWare Anomaly Treatment), one such solution, demonstrated with microarchitecture-level simulations that symptom-based solutions can provide high fault coverage and a low Silent Data Corruption (SDC) rate. However, more accurate evaluations are needed to validate such solutions for hardware faults in real-world processor designs.