High Performance Dense Linear System Solver With Resilience to Multiple Soft Errors
In the multi-petaflop era of supercomputing, the number of computing cores is growing exponentially. However, as integrated circuit technology scales below 65 nm, the critical charge required to flip a gate or a memory cell has been dangerously reduced, leading to a higher rate of soft errors induced by cosmic radiation. Soft errors threaten computing systems by producing silent data corruption, which is hard to detect and correct. Current research on soft error resilience for dense linear solvers offers limited capability on large-scale computing systems, and suffers from both soft errors and round-off error due to floating-point arithmetic.