Cloud Technologies for Bioinformatics Applications

Executive Summary

Executing a large number of independent tasks in parallel, or tasks that require only minimal inter-task communication, is a common requirement in many domains. In this paper, the authors present their experience in applying two new Microsoft technologies, Dryad and Azure, to three bioinformatics applications. For one of the applications, they also compare these with traditional MPI and Apache Hadoop MapReduce implementations. The applications are an EST (Expressed Sequence Tag) sequence assembly program, the PhyloD statistical package for identifying HLA-associated viral evolution, and a pairwise Alu gene alignment application. The authors give a detailed performance discussion based on a 768-core Windows HPC Server cluster and an Azure cloud.

  • Format: PDF
  • Size: 418.19 KB