Ten Million and One Penguins, Or, Lessons Learned From Booting Millions of Virtual Machines on HPC Systems
In this paper, the authors describe Megatux, a set of tools they are developing for rapidly provisioning, controlling, and monitoring millions of virtual machines. They also report what they have learned from booting one million Linux virtual machines on the Thunderbird cluster (4660 nodes) and 550,000 Linux virtual machines on the Hyperion cluster (1024 nodes). As might be expected, their tools use hierarchical structures. In contrast to existing HPC systems, their tools do not require perfect hardware, do not require that all systems be booted at the same time, and do not depend on static configuration files that define the role of each node. While they believe these tools will be useful for future HPC systems, they are using them today to construct botnets.