Date Added: Jan 2011
Cloud computing promises high scalability, flexibility, and cost-effectiveness to satisfy emerging computing requirements. To provision computing resources in the cloud efficiently, system administrators need to be able to characterize and predict server workload. In this paper, the authors use data traces obtained from a real data center to develop such capabilities. First, they search for repeatable workload patterns by exploring the cross-server performance correlations that result from dependencies among applications running on different servers. Treating server workload data samples as multiple time series, they develop a co-clustering technique that identifies groups of servers that frequently exhibit correlated workload patterns, as well as the time periods in which these server groups are active.
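The abstract does not describe the co-clustering algorithm itself, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a much simpler stand-in technique: split the traces into fixed, non-overlapping time windows, compute pairwise Pearson correlation between servers inside each window, and report servers linked by a correlation at or above a threshold as one group. The function names (`pearson`, `correlated_groups`), the `window` and `threshold` parameters, and the sample traces are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_groups(series, window, threshold=0.9):
    """Return (start, end, groups) for each non-overlapping time window in
    which at least two servers have pairwise correlation >= threshold.

    Assumption: a group is a connected component of the "correlated" graph,
    which is a simplification and not the co-clustering used in the paper.
    """
    servers = sorted(series)
    length = min(len(series[s]) for s in servers)
    results = []
    for start in range(0, length - window + 1, window):
        end = start + window
        # Union-find structure: every server starts in its own group.
        parent = {s: s for s in servers}

        def find(s):
            while parent[s] != s:
                parent[s] = parent[parent[s]]  # path compression
                s = parent[s]
            return s

        # Merge the groups of any two servers that are strongly correlated.
        for a, b in combinations(servers, 2):
            if pearson(series[a][start:end], series[b][start:end]) >= threshold:
                parent[find(a)] = find(b)

        components = {}
        for s in servers:
            components.setdefault(find(s), []).append(s)
        groups = sorted(m for m in components.values() if len(m) > 1)
        if groups:
            results.append((start, end, groups))
    return results


# Hypothetical workload samples (for example, CPU utilization per interval).
traces = {
    "web":   [10, 20, 30, 40, 5, 5, 6, 5],
    "db":    [12, 22, 33, 41, 9, 3, 7, 8],
    "batch": [50, 10, 45, 12, 1, 2, 3, 4],
}
groups = correlated_groups(traces, window=4)
print(groups)  # [(0, 4, [['db', 'web']])]
```

In this toy data the web and db servers move together only during the first window, so the sketch reports both a server group and the time period in which it is active, which is the kind of output the abstract describes.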