Adaptive Provisioning of Stream Processing Systems in the Cloud
With the advent of data-intensive applications that generate large volumes of real-time data, Distributed Stream Processing Systems (DSPSs) have become increasingly important in domains such as social networking and web analytics. In practice, DSPSs must handle highly variable workloads caused by unpredictable changes in stream rates. Cloud computing offers an elastic infrastructure from which DSPSs can obtain resources on demand, but deciding on the correct resource allocation when deploying a DSPS in the cloud remains an open problem. This paper proposes an adaptive approach for provisioning Virtual Machines (VMs) for use by a DSPS in the cloud. The authors first run a set of benchmarks across performance metrics such as network latency and jitter to explore the feasibility of cloud-based DSPS deployments.