
GPU-Qin: A Methodology for Evaluating the Error Resilience of GPGPU Applications
While graphics processing units (GPUs) have gained wide adoption as accelerators for general-purpose applications (GPGPU), the end-to-end reliability implications of their use have not been quantified. Fault injection is a widely used method for evaluating the reliability of applications. However, building a fault injector for GPGPU applications is challenging due to their massive parallelism, which ...