Date Added: May 2012
Data de-duplication is widely used in cloud infrastructure to optimize the storage and transfer of Virtual Machines (VMs). It identifies and eliminates redundant data across VM image files, reducing the amount of data that must be stored and/or transferred. The reduction is significant: recent studies show that similarity across virtual machines can be as high as 96%. Moving a group of VM images within, or across, data centers is a frequent operation that supports application migration, new application deployment, and backup and maintenance operations. While de-duplication reduces the overall size of a group of VM images, it complicates their efficient transfer and re-incarnation at the destination site.
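To make the idea concrete, the following is a minimal sketch of block-level de-duplication across a group of images, not the method described in this paper. It assumes fixed-size 4 KB chunks and SHA-256 content hashes; the names `deduplicate`, `restore`, `store`, and `recipes` are hypothetical and chosen only for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


def deduplicate(images, chunk_size=CHUNK_SIZE):
    """Split each image into chunks and keep every distinct chunk once.

    images: dict mapping an image name to its raw bytes.
    Returns (store, recipes): store maps a chunk digest to the chunk bytes,
    recipes maps an image name to the ordered list of its chunk digests.
    """
    store = {}
    recipes = {}
    for name, data in images.items():
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # identical chunks are stored once
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def restore(store, recipe):
    """Rebuild one image at the destination from its chunk recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Two toy "images" that share a common base and differ in the last chunk.
    base = b"A" * (2 * CHUNK_SIZE)
    images = {"vm1": base + b"B" * CHUNK_SIZE, "vm2": base + b"C" * CHUNK_SIZE}
    store, recipes = deduplicate(images)
    raw = sum(len(data) for data in images.values())
    unique = sum(len(chunk) for chunk in store.values())
    print(f"raw bytes: {raw}, unique bytes: {unique}")
    assert restore(store, recipes["vm1"]) == images["vm1"]
```

In this sketch only the unique chunks and the per-image recipes would need to be moved, but each image must be reassembled from its recipe at the destination, which hints at why transferring and re-creating a de-duplicated group of images is less straightforward than copying plain files.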