Association for Computing Machinery
The increasing pressure to make hardware resilient to runtime failures has prompted development of design techniques for specific classes of systems, e.g. processors and routers. However, these techniques come at increased design and verification costs, thus limiting their broader application. In this paper, the authors describe a methodology for general RTL designs based on the widely usable checkpointing and rollback resiliency mechanism. They take a modeling and language approach that provides an appropriate set of abstractions for the resiliency logic. This cleanly separates the main design behavior from the resiliency behavior, leading to ease of design.