Future large-scale multi-cores will likely be best suited for use within High-Performance Computing (HPC) domains. A large fraction of HPC workloads employs the Message Passing Interface (MPI), yet multi-cores continue to be optimized for shared-memory workloads. In this position paper, the authors put forth the design of a unique chip that is optimized for MPI workloads. The design introduces specialized hardware to optimize the transfer of messages between cores. It also eliminates most aspects of on-chip cache coherence, which not only reduces complexity and power but also improves shared-memory producer-consumer behavior and the efficiency of the buffer copies used during message transfers.
- Format: PDF
- Size: 117.1 KB