High Performance MPI Design Using Unreliable Datagram for Ultra-Scale InfiniBand Clusters
High-performance clusters have been growing rapidly in scale. Most of these clusters deploy a high-speed interconnect, such as InfiniBand, to achieve high performance. Most scientific applications executing on these clusters use the Message Passing Interface (MPI) as the parallel programming model. The MPI library therefore plays a key role in application performance: it must consume as few resources as possible while delivering scalable communication. State-of-the-art MPI implementations over InfiniBand primarily use the Reliable Connection (RC) transport due to its good performance and attractive features.
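As background for the transport discussion above, the minimal sketch below (not from the paper) shows how the choice between the Reliable Connection and Unreliable Datagram transports surfaces in the libibverbs API: the same ibv_create_qp() call is used, and the qp_type field selects IBV_QPT_RC or IBV_QPT_UD. The device selection and queue depths are illustrative assumptions.

```c
/* Illustrative sketch: create a queue pair on the first InfiniBand device,
 * selecting the RC or UD transport via qp_type. Error handling is minimal. */
#include <stdio.h>
#include <infiniband/verbs.h>

static struct ibv_qp *create_qp(struct ibv_pd *pd, struct ibv_cq *cq,
                                enum ibv_qp_type type)
{
    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .cap = {
            .max_send_wr  = 64,   /* assumed queue depths for illustration */
            .max_recv_wr  = 64,
            .max_send_sge = 1,
            .max_recv_sge = 1,
        },
        .qp_type = type,          /* IBV_QPT_RC or IBV_QPT_UD */
    };
    return ibv_create_qp(pd, &attr);
}

int main(void)
{
    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) { fprintf(stderr, "no IB device found\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);
    struct ibv_cq *cq = ibv_create_cq(ctx, 128, NULL, NULL, 0);

    /* An RC QP is connected to exactly one remote QP, whereas a UD QP can
     * address any peer on each send, which is what makes it attractive at
     * large scale. */
    struct ibv_qp *rc_qp = create_qp(pd, cq, IBV_QPT_RC);
    struct ibv_qp *ud_qp = create_qp(pd, cq, IBV_QPT_UD);
    printf("RC QP num: %u, UD QP num: %u\n",
           rc_qp ? rc_qp->qp_num : 0, ud_qp ? ud_qp->qp_num : 0);

    /* Resource teardown omitted for brevity in this sketch. */
    ibv_free_device_list(devs);
    return 0;
}
```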