Using Cloud Computing for Parallel Analysis of Genome-Wide Datasets

Executive Summary

Analysis of today's genome-wide datasets places ever-increasing demands on computational capacity. For example, imputation, computation of empirical p-values, epistatic analysis, or QTL analysis of a large number of traits may easily take days or even months on a single computer. For many of these tasks, a linear increase in performance can be achieved by parallelization, that is, by partitioning the data by markers, subjects, or traits and analyzing the partitions concurrently. Parallelization requires multiple computation servers, but acquiring and maintaining a computational cluster is expensive. Cloud computing offers an alternative: additional computational power without the initial investment and maintenance costs of a dedicated cluster.
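As an illustration of partitioning by markers, the following minimal Python sketch splits a set of marker indices into near-equal chunks and analyzes each chunk on its own worker. It is not taken from the paper: the names `partition`, `analyze_chunk`, and `run_parallel` are hypothetical, the per-marker computation is a placeholder, and a local thread pool stands in for the separate computation servers a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def analyze_chunk(marker_ids):
    """Placeholder (assumption): stands in for a real per-marker analysis."""
    return [m * m for m in marker_ids]


def run_parallel(n_markers, n_workers):
    """Analyze all markers by distributing chunks across workers."""
    chunks = partition(n_markers, n_workers)
    # Threads only illustrate the pattern; in practice each chunk would be
    # sent to a separate process or server.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_chunk = pool.map(analyze_chunk, chunks)  # preserves chunk order
        return [value for chunk in per_chunk for value in chunk]


if __name__ == "__main__":
    print(partition(10, 3))
    print(run_parallel(5, 2))
```

Because each chunk is analyzed independently, the same pattern applies when partitioning by subjects or traits instead of markers, provided the analysis does not need to combine information across partitions.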
