Some of the techniques used to build highly scalable servers can create an unintended performance problem for VMs; one of these is NUMA node balancing. Colin Smith provides a high-level overview of the problem and some of the ways to address it.
The highly scalable server architectures available to modern datacenters have achieved unprecedented memory and CPU densities. As a result, VM density has also increased. Some of the techniques used to build highly scalable servers can create an unintended performance problem for VMs. One common problem is NUMA node balancing. In this post, I'll try to provide a high-level overview of the problem and some of the ways to address it. Not all hypervisors deal with NUMA node issues in the same way, so I have kept this post hypervisor neutral. Specifics for your virtual environment are best addressed with your vendor.
What is NUMA memory?
NUMA (Non-Uniform Memory Access) hardware architectures use multiple memory buses to alleviate memory bus contention in multi-processor systems. This provides a huge scalability advantage over the traditional SMP (Symmetric Multi-Processing) model when large numbers of processors are required. The architecture maps specific processors to specific high-speed buses connected to specific pools of memory. Together, these form a NUMA node. Memory in the same NUMA node as the processor is considered local memory and can be accessed relatively quickly. Memory outside of the NUMA node is considered foreign memory and takes longer to access.
In the diagram above, VM0 will be fine as each core will have sufficient local memory available. VM1 should never be assigned cores in different NUMA nodes, because a NUMA-aware hypervisor should only assign a VM to a single NUMA node. VM2 will have NUMA memory fragmentation that could affect performance, because there is insufficient local memory to satisfy its 12GB requirement.
In some cases, VMs will perform better on servers with fewer physical CPUs and the same amount of memory, or even less, since each NUMA node will have more local memory. Compare a 4-processor 32GB system, where each NUMA node has 8GB of local memory, to a 2-processor 24GB system, where each NUMA node has 12GB of local memory.
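The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration: it assumes one NUMA node per physical processor, memory split evenly across nodes, and no memory reserved by the hypervisor. The function names are hypothetical and do not come from any hypervisor's API.

```python
def local_memory_per_node_gb(total_memory_gb: float, numa_nodes: int) -> float:
    """Local memory per NUMA node, assuming an even split across nodes."""
    if numa_nodes < 1:
        raise ValueError("numa_nodes must be at least 1")
    return total_memory_gb / numa_nodes


def foreign_memory_needed_gb(vm_memory_gb: float, node_local_gb: float) -> float:
    """Memory a VM would need from other nodes (0 if it fits locally)."""
    return max(0.0, vm_memory_gb - node_local_gb)


# The two systems compared in the text (assumed: one node per processor).
four_processor = local_memory_per_node_gb(32, 4)  # 8.0 GB per node
two_processor = local_memory_per_node_gb(24, 2)   # 12.0 GB per node

# A VM with a 12GB requirement, like VM2.
vm_gb = 12
print(foreign_memory_needed_gb(vm_gb, four_processor))  # 4.0
print(foreign_memory_needed_gb(vm_gb, two_processor))   # 0.0
```

Under these assumptions, the 12GB VM needs 4GB of foreign memory on the larger system and none on the smaller one.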
How does this affect VMs?
If a VM uses memory that is not part of the same NUMA node, it may have performance issues when foreign memory is required. If you have different amounts of memory in different NUMA nodes, this can be a problem if VMs are randomly distributed across nodes. Fortunately, modern hypervisors are NUMA-aware and try to assign VMs with high memory footprints to nodes with more local memory. There is also the option to assign a NUMA node affinity to a VM. This overrides the hypervisor's dynamic assignment of VMs to NUMA nodes.
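To make the idea of NUMA-aware placement concrete, here is a toy greedy heuristic in Python: place the largest VMs first, each on the node with the most free local memory. It is a simplified sketch, not the algorithm of any particular hypervisor. The `Node` class, `place_vms` function, and the node and VM names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_gb: float
    vms: list = field(default_factory=list)


def place_vms(vm_memory_gb: dict, nodes: list) -> dict:
    """Greedy sketch: largest VMs first, each onto the node with the most free memory.

    Returns {vm_name: node_name}. A VM that fits on no single node maps to None,
    meaning it would need foreign memory.
    """
    placement = {}
    by_size = sorted(vm_memory_gb.items(), key=lambda kv: kv[1], reverse=True)
    for vm, need_gb in by_size:
        best = max(nodes, key=lambda n: n.free_gb)
        if best.free_gb >= need_gb:
            best.free_gb -= need_gb
            best.vms.append(vm)
            placement[vm] = best.name
        else:
            placement[vm] = None  # no single node can hold this VM locally
    return placement


# Hypothetical host with unequal local memory per node.
nodes = [Node("node0", 16), Node("node1", 8)]
placement = place_vms({"vm_a": 12, "vm_b": 6, "vm_c": 4}, nodes)
print(placement)  # {'vm_a': 'node0', 'vm_b': 'node1', 'vm_c': 'node0'}
```

A real hypervisor also has to weigh CPU load, memory overcommit, and migration cost, which is part of why vendor-specific guidance matters.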
Some problematic scenarios
Consider a series of dormant VMs that have NUMA affinity assignments. When they spin up, they will be assigned to the NUMA node that is designated in the affinity setting. If too many VMs are assigned to the same NUMA node, there is the potential for processor resource contention within a single node while other nodes are underutilized. Additionally, the ability to overcommit memory can exacerbate the issue in some situations. What if the memory footprint of a VM is larger than the memory in the NUMA node?
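A quick way to spot the affinity pile-up described above is to total the vCPUs pinned to each node and compare that against the cores available there. The Python sketch below assumes a simple inventory of VMs with a vCPU count and an affinity node. The field names, the `max_ratio` threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict


def vcpus_per_node(vms: list) -> dict:
    """Sum the vCPUs of all VMs pinned to each NUMA node."""
    totals = defaultdict(int)
    for vm in vms:
        totals[vm["affinity_node"]] += vm["vcpus"]
    return dict(totals)


def oversubscribed_nodes(vms: list, cores_per_node: int, max_ratio: float = 1.0) -> list:
    """Nodes whose pinned vCPUs exceed cores_per_node * max_ratio."""
    limit = cores_per_node * max_ratio
    return sorted(node for node, vcpus in vcpus_per_node(vms).items() if vcpus > limit)


# Hypothetical inventory: three VMs pinned to node 0, one to node 1.
vms = [
    {"name": "vm0", "vcpus": 4, "affinity_node": 0},
    {"name": "vm1", "vcpus": 4, "affinity_node": 0},
    {"name": "vm2", "vcpus": 2, "affinity_node": 0},
    {"name": "vm3", "vcpus": 2, "affinity_node": 1},
]
print(vcpus_per_node(vms))                          # {0: 10, 1: 2}
print(oversubscribed_nodes(vms, cores_per_node=8))  # [0]
```

Here node 0 carries 10 vCPUs on 8 cores while node 1 sits nearly idle, which is the imbalance that static affinity settings can produce when dormant VMs all spin up.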
There is an art to balancing the NUMA node memory and processor requirements so that VM performance is optimized. A large part of that is having a good understanding of the workloads that your VMs are running and what the impacts of poor performance might be.
In my previous post, I indicated that VM- and hypervisor-aware monitoring is important to get a true picture of VM and host performance. Situations like NUMA affinity are exactly what traditional performance monitoring tools have trouble addressing. These are the types of scenarios that a new breed of performance metrics helps to manage. Simply monitoring the hosts and VMs independently is not sufficient. You need to ensure that you understand the issues, that you have instrumentation in place to provide adequate telemetry, that you have thresholds and trigger points defined, and, most importantly, that you have the ability to react when they are reached.