University of California
Manufacturing and environmental variations cause timing errors that are typically avoided by conservative design guardbands or corrected by circuit-level error detection and correction. Both measures incur energy and performance penalties. This paper considers methods to reduce that cost by extending variability mitigation up through the software stack. In particular, the authors propose workload deployment methods that reduce the likelihood of timing errors in shared-memory clusters of processor cores. These and other methods are incorporated into a runtime layer of the OpenMP framework that enables parsimonious countermeasures against timing errors induced by hardware variability.