Quantifying Performance Benefits of Overlap Using MPI-2 in a Seismic Modeling Application
AWM-Olsen is a widely used ground motion simulation code based on a parallel finite difference solution of the 3-D velocity-stress wave equation. This application runs on tens of thousands of cores and consumes several million CPU hours on the TeraGrid Clusters every year. A significant portion of its run-time (37% in a 4,096 process run), is spent in MPI communication routines. Hence, it demands an optimized communication design coupled with a low-latency, high-bandwidth network and an efficient communication subsystem for good performance. In this paper, the authors analyze the performance bottlenecks of the application with regard to the time spent in MPI communication calls.