Automatic Run-Time Parallelization and Transformation of I/O
As computational clusters grow, I/O can be expected to consume an increasing share of wall-clock time as problem sizes and node counts scale up, unless parallel I/O is introduced. Unfortunately, using parallel I/O is non-trivial, so few applications developed by individual researchers benefit from it. In this paper, the authors describe a novel method for analyzing I/O and communication operations at run time. When a node performs an I/O or communication operation, the technique protects the memory associated with the request against access by the application. Subsequent operations are then analyzed for overlap between communication and I/O operations.
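The abstract does not say how overlap is detected, so the following is only a minimal sketch of one plausible reading: each operation is recorded with the byte range of its buffer, and an I/O operation and a communication operation are flagged when their ranges intersect. The `Op` record, the `"io"`/`"comm"` labels, and both function names are hypothetical and are not taken from the paper. The memory-protection step is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical record of one operation and the buffer it touches."""
    kind: str    # "io" or "comm" (illustrative labels, not from the paper)
    start: int   # offset of the first byte of the buffer
    length: int  # buffer length in bytes (assumed > 0)

    @property
    def end(self) -> int:
        # Exclusive end of the half-open range [start, end).
        return self.start + self.length


def ranges_overlap(a: Op, b: Op) -> bool:
    # Two half-open ranges intersect iff each starts before the other ends.
    return a.start < b.end and b.start < a.end


def find_io_comm_overlaps(ops):
    """Return (earlier, later) pairs where an I/O op and a comm op share bytes."""
    pairs = []
    for i, earlier in enumerate(ops):
        for later in ops[i + 1:]:
            if earlier.kind != later.kind and ranges_overlap(earlier, later):
                pairs.append((earlier, later))
    return pairs


# Example trace: a receive into [0, 4096), a write from [1024, 2048),
# and a write from an unrelated buffer.
ops = [
    Op("comm", 0, 4096),
    Op("io", 1024, 1024),
    Op("io", 8192, 512),
]
overlaps = find_io_comm_overlaps(ops)
```

In this trace only the first two operations are paired, since the third buffer shares no bytes with the receive. The pairwise scan is quadratic in the number of recorded operations; a real run-time system would presumably use an interval index, but that is beyond what the abstract states.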