University of Colorado
Infrastructure-as-a-service (IaaS) clouds, such as Amazon EC2, offer pay-for-use virtual resources on demand. This allows users to outsource computation and storage when needed and to create elastic computing environments that adapt to changing demand. However, existing services, such as cluster resource managers (e.g., Torque), do not support elastic environments. Furthermore, no re-contextualization services exist to reconfigure these environments as they continually adapt to changes in demand. In this paper, the authors present an architecture for a large-scale elastic cluster environment. They extend an open-source elastic IaaS manager, the Elastic Processing Unit (EPU), to support the Torque batch-queue scheduler. They also develop a lightweight REST-based re-contextualization broker that periodically reconfigures the cluster as nodes join or leave the environment.
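
The periodic reconfiguration step can be illustrated with a minimal sketch. This is not the authors' broker: the names `MembershipDelta`, `diff_membership`, and `plan_commands` are hypothetical, and the sketch assumes the broker can obtain the current set of worker hostnames and that workers are registered with Torque through `qmgr` node commands. It only computes which nodes joined or left since the last pass and builds the corresponding command strings, without executing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MembershipDelta:
    """Hypothetical record of one reconfiguration pass."""
    joined: frozenset  # hostnames reported now but not yet known to the scheduler
    left: frozenset    # hostnames known to the scheduler but no longer reported


def diff_membership(known, reported):
    """Compare the nodes the head node already knows with those currently reported."""
    known, reported = frozenset(known), frozenset(reported)
    return MembershipDelta(joined=reported - known, left=known - reported)


def plan_commands(delta):
    """Build qmgr command strings for a delta (assumed mechanism; nothing is run here)."""
    cmds = [f'qmgr -c "create node {n}"' for n in sorted(delta.joined)]
    cmds += [f'qmgr -c "delete node {n}"' for n in sorted(delta.left)]
    return cmds


if __name__ == "__main__":
    # Example pass: node1 has left the environment, node3 has joined it.
    delta = diff_membership({"node1", "node2"}, {"node2", "node3"})
    for cmd in plan_commands(delta):
        print(cmd)
```

In a real deployment this comparison would run on a timer, with the reported set fetched from the broker's REST interface; both details are left out because the abstract does not specify them.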