Large-Scale Privacy-Preserving Mapping of Human Genomic Sequences on Hybrid Clouds
An operation preceding most human DNA analyses is read mapping, which aligns millions of short sequences (called reads) to a reference genome. This step involves an enormous amount of computation (evaluating edit distances for millions, even billions, of sequence pairs) and thus needs to be outsourced to low-cost commercial clouds. Outsourcing in turn calls for scalable techniques to protect sensitive DNA information, a demand that existing techniques (e.g., homomorphic encryption, secure multiparty computation) cannot meet. In this paper, the authors report a new step towards secure and scalable read mapping on the hybrid cloud, which includes both the public commercial cloud and the private cloud within an organization.
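To make the computational cost concrete, the edit distance between one read and one candidate stretch of the reference can be computed with the textbook dynamic program sketched below. This is a minimal illustration of the underlying operation only, not the authors' privacy-preserving method; the function name `edit_distance` and the example sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic two-row dynamic program.

    Illustrative only: a plain, unprotected computation, not the
    secure mapping scheme described in the paper.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    read = "ACGTTGCA"        # hypothetical short read
    candidate = "ACGTAGCA"   # hypothetical reference substring
    print(edit_distance(read, candidate))
```

Each call takes time proportional to the product of the two sequence lengths, so repeating it over millions of reads and many candidate positions is what makes the workload heavy enough to motivate outsourcing.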