Resilience Issues for Application Workflows on Clouds
Two areas are currently the focus of active research, namely cloud computing and high-performance computing. Their expected impact on business and scientific computing is such that most application areas are eagerly up taking or waiting for the associated infrastructures. However, open issues still remain. Resilience and load balancing are examples of such areas where innovative solutions are required to face new or increasing challenges, e.g., fault-tolerance. This paper presents existing concepts and open issues related to the design, implementation and deployment of a fault-tolerant application framework on cloud computing platforms. Experiments are sketched including the support for application resilience, i.e., fault tolerance and exception-handling.