Balancing Soft Error Coverage With Lifetime Reliability in Redundantly Multithreaded Processors

Silicon reliability is a key challenge facing the microprocessor industry. Processors must be designed to be resilient against both soft errors and lifetime reliability phenomena. However, techniques developed to address one class of reliability problems may adversely affect other aspects of silicon reliability. In this paper, the authors show that Redundant Multi-Threading (RMT), which provides soft error protection, degrades lifetime reliability. They then explore two architectural approaches to tackle this problem: Dynamic Voltage Scaling (DVS) and partial RMT. The authors show that each approach has its own strengths and weaknesses with respect to performance, soft error coverage, and lifetime reliability.