High Performance Parallel Computing With Clouds and Cloud Technologies

Date Added: Jun 2009
Format: PDF

Infrastructure services (Infrastructure-as-a-Service), provided by cloud vendors, allow any user to provision a large number of compute instances fairly easily. Whether these virtual resources are leased from public clouds or allocated from private clouds, using them for data- and compute-intensive analyses requires employing different parallel runtimes to implement such applications. Among the many parallelizable problems, most "Pleasingly Parallel" applications can be implemented fairly easily using MapReduce technologies such as Hadoop, CGL-MapReduce, and Dryad. However, many scientific applications that require complex communication patterns still need optimized runtimes such as MPI.
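
To make the "Pleasingly Parallel" case concrete, the following is a minimal sketch of the map/reduce pattern in plain Python. It is an illustration only and does not use the Hadoop, CGL-MapReduce, or Dryad APIs. The function names (`map_count`, `reduce_counts`, `word_count`) and the word-count task are assumptions chosen for the example, and a local thread pool stands in for the cloud compute instances.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_count(chunk):
    """Map step (hypothetical example task): count words in one chunk.

    Each call depends only on its own chunk, so calls can run in any
    order and on any worker without talking to each other.
    """
    return Counter(chunk.lower().split())


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(chunks, workers=4):
    """Run the map step over all chunks in a pool, then merge the results.

    The thread pool is a stand-in for a set of provisioned instances;
    it shows the structure of the computation, not its performance.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    totals = word_count(["clouds and MPI", "clouds and Hadoop"])
    print(totals.most_common())
```

The map tasks never exchange data with one another, which is what makes the problem easy to distribute. Applications whose tasks must exchange intermediate results repeatedly during the computation do not fit this shape as directly, which is the case the abstract associates with runtimes such as MPI.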