A Case for Tracking and Exploiting Inter-Node and Intra-Node Memory Content Sharing in Virtualized Large-Scale Parallel Systems
In virtualized large-scale parallel systems, scientific workloads consist of numerous processes running across many virtual nodes. Their aggregate memory footprint is massive, which has consequences for services that improve performance, reliability, or power efficiency. The authors argue that a service that dynamically tracks the sharing of memory content, both within individual nodes and across nodes, can simplify and enhance the implementation of such services. For example, leveraging content sharing could significantly reduce the size of a checkpoint of a group of nodes. As another example, it could speed up VM migration by allowing a VM's memory to be reconstructed from multiple source VMs.
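To make the checkpoint example concrete, the following is a minimal sketch of content-based sharing detection, not the authors' implementation. It assumes that memory is compared at a fixed page granularity and that pages with equal SHA-256 digests hold equal content. The names `PAGE_SIZE`, `build_sharing_index`, and `checkpoint_page_counts` are hypothetical and introduced only for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative)


def build_sharing_index(
    nodes: Dict[str, bytes], page_size: int = PAGE_SIZE
) -> Dict[bytes, List[Tuple[str, int]]]:
    """Map each page-content digest to every (node, page_index) holding it.

    A digest with more than one location indicates sharing: intra-node if
    the locations are on one node, inter-node if they span several nodes.
    """
    index: Dict[bytes, List[Tuple[str, int]]] = defaultdict(list)
    for name, memory in nodes.items():
        for offset in range(0, len(memory), page_size):
            page = memory[offset:offset + page_size]
            digest = hashlib.sha256(page).digest()
            index[digest].append((name, offset // page_size))
    return index


def checkpoint_page_counts(
    nodes: Dict[str, bytes], page_size: int = PAGE_SIZE
) -> Tuple[int, int]:
    """Return (pages in a naive checkpoint, pages if each content is stored once)."""
    index = build_sharing_index(nodes, page_size)
    naive = sum(len(locations) for locations in index.values())
    deduplicated = len(index)
    return naive, deduplicated


# Toy memory images: "A" repeats within node0 (intra-node sharing),
# and "B" appears on both nodes (inter-node sharing).
toy_nodes = {
    "node0": b"A" * PAGE_SIZE + b"B" * PAGE_SIZE + b"A" * PAGE_SIZE,
    "node1": b"B" * PAGE_SIZE + b"C" * PAGE_SIZE,
}
naive_pages, unique_pages = checkpoint_page_counts(toy_nodes)
```

In this toy case the naive checkpoint holds five pages, while storing each distinct content once needs three. The same index could in principle serve the migration example, since each digest lists every node that already holds a copy of a page, but a real tracking service would also have to handle pages that change after they are hashed, which this sketch ignores.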