Fault-Tolerant Communication Runtime Support for Data-Centric Programming Models

Executive Summary

The largest supercomputers in the world today consist of hundreds of thousands of processing cores and an even larger number of other hardware components. At such scales, hardware faults are commonplace, necessitating fault-resilient software systems. While different fault-resilience models are available, most focus on allowing the computational processes to survive faults. The authors, by contrast, have recently started investigating fault-resilience techniques for data-centric programming models such as the Partitioned Global Address Space (PGAS) models. The primary difference in data-centric models is the decoupling of computation and data locality.

  • Format: PDF
  • Size: 184.9 KB