Provisioning Multi-tier Cloud Applications Using Statistical Bounds on Sojourn Time
In this paper, the authors present a simple and effective resource-provisioning approach for achieving a percentile bound on the end-to-end response time of a multi-tier application. They first model the multi-tier application as an open tandem network of M/G/1-PS queues and develop a method that produces a near-optimal application configuration, i.e., the number of servers at each tier, that meets the percentile bound in a homogeneous environment with a single type of server. They then extend the solution to a K-server case; the technique demonstrates good accuracy, independent of the variability of service times.
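To make the modeling step concrete, the following is a minimal sketch of percentile-driven provisioning for a tandem of M/G/1-PS tiers. It is not the authors' method. It relies on the standard result that the mean sojourn time of an M/G/1-PS queue is E[S] / (1 - rho), regardless of the service-time distribution, and it swaps in Markov's inequality as a deliberately conservative percentile bound. The function names (`mean_sojourn`, `provision`), the even split of load across the servers of a tier, and the greedy search are all assumptions made for illustration.

```python
import math


def mean_sojourn(arrival_rate, mean_service, servers):
    """Mean end-to-end sojourn time of a tandem of M/G/1-PS tiers.

    Assumption: tier i splits its Poisson arrivals evenly (at random)
    across servers[i] identical servers, each an M/G/1-PS queue.
    Returns math.inf if any tier is unstable (utilization >= 1).
    """
    total = 0.0
    for s, n in zip(mean_service, servers):
        rho = arrival_rate * s / n  # per-server utilization at this tier
        if rho >= 1.0:
            return math.inf
        total += s / (1.0 - rho)  # M/G/1-PS mean sojourn time
    return total


def provision(arrival_rate, mean_service, deadline, percentile):
    """Greedy server counts so that P(T <= deadline) >= percentile.

    Uses Markov's inequality, P(T > d) <= E[T] / d, so it is enough to
    drive the mean E[T] below (1 - percentile) * deadline. This is a
    loose, conservative stand-in for a real percentile estimate.
    """
    budget = (1.0 - percentile) * deadline
    if sum(mean_service) >= budget:
        # Even with unlimited servers the mean cannot drop below the
        # total service demand, so the Markov-based target is unreachable.
        raise ValueError("target unreachable under the Markov bound")

    # Start from the smallest stable configuration at every tier.
    servers = [math.floor(arrival_rate * s) + 1 for s in mean_service]

    def with_extra(i):
        # Mean sojourn time if one more server is added at tier i.
        trial = list(servers)
        trial[i] += 1
        return mean_sojourn(arrival_rate, mean_service, trial)

    while mean_sojourn(arrival_rate, mean_service, servers) > budget:
        # Add one server where it lowers the mean the most.
        best = min(range(len(servers)), key=with_extra)
        servers[best] += 1
    return servers


if __name__ == "__main__":
    # Hypothetical workload: 12 requests/s, two tiers.
    print(provision(12.0, [0.04, 0.11], deadline=2.0, percentile=0.9))
```

Because Markov's inequality uses only the mean, this sketch will typically allocate more servers than a method that estimates the response-time percentile directly; it is meant only to show the shape of the search over per-tier server counts.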