Université Paris Diderot
Due to many factors such as, high transistor density, high frequency, and low voltage, today's processors are more than ever subject to hardware failures. These errors have various impacts depending on the location of the error and the type of processor. Because of the hierarchical structure of the compute units and work scheduling, the hardware failure on GPUs affects only part of the application. In this paper, the authors present a new methodology to characterize the hardware failures of Nvidia GPUs based on a software micro-benchmarking platform implemented in OpenCL.