SLOAVx: Scalable LOgarithmic AlltoallV Algorithm for Hierarchical Multicore Systems
Scientific applications use collective communication operations in Message Passing Interface (MPI) for global synchronization and data exchanges. Alltoall and AlltoallV are two important collective operations. They are used by MPI jobs to exchange messages among all of MPI processes. AlltoallV is a generalization of Alltoall, supporting messages of varying sizes. However, the existing MPI AlltoallV implementation has linear complexity, i.e., each process has to send messages to all other processes in the job. Such linear complexity can result in suboptimal scalability of MPI applications when they are deployed on millions of cores.