The emergence of multi-core processors has made MPI intra-node communication a critical component in high-performance computing. In this paper, the authors use a three-step methodology to design an efficient MPI intra-node communication scheme from two popular approaches: shared memory and OS kernel-assisted direct copy. The study uses an Intel quad-core cluster. The authors first run micro-benchmarks to analyze the advantages and limitations of the two approaches, including the impact of processor topology, communication buffer reuse, process skew, and L2 cache utilization.