Server virtualization has been growing in popularity for a decade, and some people now believe it’s not just popular but standard practice. But what is the most up-to-date advice for planning, implementing, and maintaining a virtualization project? We asked two experts: David Barker, founder and technical director of 4D Data Centers in London, UK, and Peter Rittwage, partner and senior technical engineer at IntelliSystems, with offices throughout Georgia and South Carolina.
SEE: Virtualization policy (Tech Pro Research)
Koblentz: Describe the typical size of your customer accounts.
Barker: Our clients range in size from small businesses with a few employees up to large enterprises with over 1,000 employees. The overall client demographic is a mixture of colocation, public cloud, and private managed clouds. While colocation represents the largest share of our business within the context of virtualization, the majority of the smaller clients reside on the public cloud platform that we operate, while the larger enterprises tend to go for private managed cloud platforms based around [Microsoft] Hyper-V or [Dell Technologies] VMware.
Rittwage: The typical size is about 25 users, although we have some with 300+ and some with just a few computers.
Koblentz: What are the biggest challenges when virtualizing servers these days?
Barker: The biggest challenge in virtualization is still the sharing of resources across your infrastructure and applications. Whichever way you look at it, some things will need to be prioritized over others within the infrastructure.
Designing a virtualized platform is a balancing act between competing resources; you will most likely still have bottlenecks, but ideally you will have moved them to where they have the least impact on your applications. You need to consider the network provision, both for external WAN traffic and for storage traffic. If you are consolidating 100 physical machines, each with a fairly heavily utilized 1 Gb network interface, down to 10 hypervisor nodes, it is likely you will need to bump the network to at least 10 Gb to cope with the condensed traffic of those systems running on a reduced number of NICs. You can’t always expect to pick the existing network up and drop it into a newly virtualized environment.
Similar issues exist with the storage. Most virtualized deployments still provision a central storage array, and this is quite often the bottleneck of the whole deployment. While a 10 Gb storage network will likely provide enough raw throughput to the array, the raw disk I/O available from the physical disks is often overlooked, because it was less of an issue when applications were spread across a number of physical servers. If the disks can’t keep up with the number of reads and writes being thrown at them by the virtual machines, performance will start to suffer, especially in things like database applications, which rely heavily on disk I/O.
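To put rough numbers on both points, here is a minimal back-of-the-envelope sketch in Python; the figures in it (100 source servers averaging 0.6 Gb/s each, 10 hypervisor nodes, 24 spindles at roughly 180 IOPS per 15k disk, and 5,000 IOPS of combined demand) are illustrative assumptions, not measurements from any real environment.

```python
# Back-of-the-envelope consolidation math (all figures are illustrative assumptions).

def network_per_node_gbps(source_servers, avg_gbps_per_server, hypervisor_nodes):
    """Average network traffic each hypervisor must carry after consolidation."""
    return source_servers * avg_gbps_per_server / hypervisor_nodes

def array_iops(spindles, iops_per_spindle=180):
    """Rough aggregate IOPS for an array of 15k spinning disks
    (ignores RAID write penalty and controller cache)."""
    return spindles * iops_per_spindle

if __name__ == "__main__":
    # 100 physical servers, each pushing ~0.6 Gb/s, consolidated onto 10 nodes.
    print(f"Per-node network load: {network_per_node_gbps(100, 0.6, 10):.1f} Gb/s")

    # 24 x 15k spindles versus an assumed 5,000 IOPS of combined VM demand.
    demand_iops = 5_000
    supply_iops = array_iops(24)
    print(f"Array delivers ~{supply_iops} IOPS against ~{demand_iops} IOPS of demand")
    if supply_iops < demand_iops:
        print("Disk I/O is the likely bottleneck; consider more spindles, SSD cache, or tiering")
```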
Rittwage: We still run into hardware security dongles that need to be attached to USB, and sometimes they will not “poke through” the virtualization layer into the VM guest. We also still occasionally run into a software vendor that doesn’t “support” virtualization and then won’t help with product support, but that is rarer now.
SEE: Cloud v. data center decision (ZDNet special report) | Download the report as a PDF (TechRepublic)
Koblentz: What are the solutions to address those challenges when you’re planning a virtualization project?
Barker: While there are technical solutions that can help alleviate some of these issues, such as SSD caching within the storage array or moving to a clustered storage platform, they have their own drawbacks that need to be considered before you rely on them.
One of the best ways to mitigate these issues is through detailed benchmarking of the current physical servers and careful planning of how you are going to virtualize the infrastructure. Before making any hardware or virtualization decisions, you want to know how much bandwidth each server uses for WAN traffic, its CPU/RAM utilization under normal and peak loads, and the amount of disk I/O going on within each server.
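As one possible starting point for that benchmarking, the sketch below samples CPU, RAM, disk I/O, and network throughput on a single host using the third-party psutil library; the sampling interval and plain-print output are arbitrary choices for illustration, and a real exercise would collect these figures over days or weeks of normal and peak load and feed them into a proper monitoring system.

```python
# Minimal host benchmarking sketch using psutil (pip install psutil).
# Samples CPU, RAM, disk I/O, and network throughput over a short window.
import time
import psutil

def sample(interval_seconds=5):
    disk_before = psutil.disk_io_counters()
    net_before = psutil.net_io_counters()
    cpu_percent = psutil.cpu_percent(interval=interval_seconds)  # blocks for the interval
    disk_after = psutil.disk_io_counters()
    net_after = psutil.net_io_counters()

    reads_per_sec = (disk_after.read_count - disk_before.read_count) / interval_seconds
    writes_per_sec = (disk_after.write_count - disk_before.write_count) / interval_seconds
    net_mbps = ((net_after.bytes_sent + net_after.bytes_recv)
                - (net_before.bytes_sent + net_before.bytes_recv)) * 8 / interval_seconds / 1e6

    return {
        "cpu_percent": cpu_percent,
        "ram_percent": psutil.virtual_memory().percent,
        "disk_iops": round(reads_per_sec + writes_per_sec, 1),
        "network_mbps": round(net_mbps, 1),
    }

if __name__ == "__main__":
    while True:
        print(sample())   # in practice, append to a CSV or push to a monitoring system
        time.sleep(55)    # roughly one sample per minute
```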
By having this information early on, you can make hardware procurement decisions that will at least match current performance and hopefully improve it through newer chipsets, faster memory, and so on. It also pays to map out failure scenarios within the virtualized environment and to keep spare hypervisor resources available to cover at least the failure of one physical hypervisor node, so that the virtual machines it was running have resources to migrate into without overly impacting the performance of the virtual machines and applications already running on the remaining nodes.
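One simple way to sanity-check that failover headroom is to confirm the surviving nodes could absorb the memory of the busiest node, since RAM (unlike CPU) cannot be oversubscribed. The sketch below does exactly that; the node names, RAM sizes, and allocations are hypothetical, and the 10 percent per-node reserve for the hypervisor itself is an assumed figure.

```python
# N+1 headroom check: can the surviving nodes absorb the RAM of any single failed node?
# Node names and figures are hypothetical.

NODES_RAM_GB = {          # physical RAM installed per hypervisor node
    "hv01": 256, "hv02": 256, "hv03": 256, "hv04": 256,
}
ALLOCATED_RAM_GB = {      # RAM currently allocated to VMs on each node
    "hv01": 180, "hv02": 150, "hv03": 170, "hv04": 120,
}

def survives_single_node_failure(capacity, allocated, reserve_fraction=0.1):
    """True if the VMs on any single failed node fit into the spare RAM elsewhere,
    keeping a reserve on each surviving node for the hypervisor itself."""
    for failed in capacity:
        spare = sum(
            capacity[n] * (1 - reserve_fraction) - allocated[n]
            for n in capacity if n != failed
        )
        if allocated[failed] > spare:
            return False
    return True

if __name__ == "__main__":
    ok = survives_single_node_failure(NODES_RAM_GB, ALLOCATED_RAM_GB)
    print("Cluster tolerates one node failure" if ok else "Not enough spare RAM for N+1")
```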
SEE: A guide to data center automation (ZDNet special report) | Download the report as a PDF (TechRepublic)
Rittwage: Usually there is an alternate licensing solution available other than hardware keys, but you have to know about it before the migration. There is also software to virtualize USB devices.
Koblentz: What are the common things that people do wrong when they’re actually installing/configuring/maintaining virtualization software?
Barker: The usual things that go wrong when deploying virtualization could be summed up as follows:
1. Improper balancing of node resources. This would be something like putting in 24-core CPUs with only 64 GB of RAM. In a virtualized environment, RAM isn’t shared between virtual machines, and you are likely to run out of memory well before you run out of CPU (which can usually be oversubscribed more than originally planned; a good rule of thumb is 1:4, with one physical core to four virtual cores). A rough sizing sketch follows this list.
2. Mismatching storage to requirements. It’s probably more important to get disk sized correctly than CPU; storage costs will escalate very rapidly compared to provisioning CPU cores. Remember that 10 Gb iSCSI is very fast, and spinning disk is actually very slow. If you have a lot of high-transaction databases that you are trying to virtualize, you will need a lot of disk I/O, which likely means a large array of 15k disks.
3. Too many networks and too many virtual switches. Quite often you will see virtualized environments with a large number of networks, a vLAN for each guest virtual machine, and the management IP address of the hypervisor node present in every vLAN. This generally isn’t required (the management IP doesn’t need to be in the same networks as the guest virtual machines) and only adds to the complexity of managing the platform. Unless there is a very specific requirement for that level of network separation, keep networks to a minimum and use access lists or firewall rules to manage virtual machine separation on the network.
4. In a similar vein, there are quite often too many virtual switches. If you do require a lot of vLANs for your environment, you don’t usually need a separate virtual switch for each vLAN; proper design of vLANs and virtual switches will provide enough network isolation for most use cases.
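As a rough illustration of point 1 above, the sketch below checks a proposed node specification against a planned set of guests using the 1:4 physical-to-virtual core rule of thumb, treating RAM as non-oversubscribable; the node spec and guest sizes are invented for the example.

```python
# Node sizing sketch: vCPU oversubscription (1 physical core : 4 vCPUs) versus RAM,
# which cannot be oversubscribed. All figures are hypothetical.

PHYSICAL_CORES = 24
PHYSICAL_RAM_GB = 64
VCPU_RATIO = 4            # rule of thumb: 1 physical core to 4 virtual cores

# Planned guests on this node: (vCPUs, RAM in GB) per VM.
GUESTS = [(4, 16), (4, 16), (2, 8), (2, 8), (2, 8), (2, 8), (4, 16)]

vcpus_needed = sum(v for v, _ in GUESTS)
ram_needed = sum(r for _, r in GUESTS)

print(f"vCPUs: {vcpus_needed} of {PHYSICAL_CORES * VCPU_RATIO} available")
print(f"RAM:   {ram_needed} GB of {PHYSICAL_RAM_GB} GB available")

if ram_needed > PHYSICAL_RAM_GB:
    print("RAM is exhausted long before CPU: this node is unbalanced (add memory)")
```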
Rittwage: Misconfiguration of vCPUs, RAM, or storage is common. Most problems I have to fix are where an administrator has over-committed shared storage. You can configure large dynamic drives that don’t take much space at first, but if you let them grow out of control, you can run out of space for all your guest VMs without proper planning. You must also pay very close attention to hardware quality and stability so that you don’t create a dangerous single point of failure in your network by consolidating all your servers. Always have redundant hardware.
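One hedge against that kind of storage over-commitment is to track how far the thin-provisioned virtual disks could grow relative to what the datastore actually holds. The sketch below computes that overcommit ratio from made-up figures; in practice you would pull the numbers from the hypervisor’s or SAN’s own reporting tools rather than hard-coding them.

```python
# Thin-provisioning overcommit check. All sizes in GB; figures are hypothetical.

DATASTORE_CAPACITY_GB = 4_000

# Maximum size each dynamic/thin virtual disk is allowed to grow to.
PROVISIONED_DISKS_GB = [500, 500, 250, 250, 1_000, 750, 750, 500, 300]

provisioned_total = sum(PROVISIONED_DISKS_GB)
overcommit_ratio = provisioned_total / DATASTORE_CAPACITY_GB

print(f"Provisioned: {provisioned_total} GB on a {DATASTORE_CAPACITY_GB} GB datastore "
      f"(overcommit ratio {overcommit_ratio:.2f}x)")

if overcommit_ratio > 1.0:
    print("If every guest fills its disk, the datastore will run out of space; "
          "monitor growth closely or add capacity before it becomes critical.")
```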
Koblentz: The best way to do something in 2008 or 2013 isn’t necessarily the best way to do it in 2018. What trends from virtualization’s early days have gone away?
Barker: The basic principle of virtualization has remained the same since VMware introduced its Workstation product in 1999 and ESX in 2001. Since then we have seen performance increases and, in particular, increased demands on storage.
Probably the biggest shift has been in the areas of virtualization management, networks, and virtual machine migration. In the early days, virtual machines tended to be very static: you would virtualize a physical server and run multiple virtual machines on it that never moved anywhere, and if the physical server failed, all of the virtual machines on it failed too. The introduction of products such as vMotion addressed this and provided for large clusters of hypervisors where virtual machines could easily migrate between physical servers in the event of a failure. This has been taken further, with VMware’s vMotion and Hyper-V’s Replica allowing virtual machines to be replicated in near-real time to separate clusters in physically separate locations, addressing the risk of a complete cluster failure.
Rittwage: Storage virtualization used to be much slower, so I would see raw drive partitions or whole drives allocated to VMs. That is no longer the case, nor is it needed; there is little to no penalty for local virtual storage.
Koblentz: What concerns about its future (now) have proven to be unfounded? Conversely, which ones turned out to be underestimated?
Barker: I think the biggest concerns, which both turned out to be unfounded, have been around the security of using virtualization and the risks of having multiple virtual machines running within the same physical infrastructure. While the recent disclosure of the Spectre and Meltdown vulnerabilities in CPU architectures reignited some of these concerns, patches were released quickly, and the exploits required root or administrator access to the systems themselves (if an attacker has that level of access to your private cloud, you have a far larger problem). In general, resource isolation and virtual machine isolation have proven secure, and issues generally arise when they are misconfigured during deployment. A properly designed virtual environment with network isolation and storage isolation (if needed) is very secure.
Rittwage: There has always been talk about malware/viruses that could attack the hypervisor, but I have not seen one. I suspect it’s very difficult to program such a thing.
Koblentz: In what context should you opt for a minor and/or application-specific virtualization product vs. using the big boys?
Barker: In 99 percent of use cases, virtualization using Hyper-V, VMware, or KVM/Xen is going to be the way to go, and the decision comes down to the skills available to manage those platforms as well as the appetite to pay the licensing costs (which scale up from KVM/Xen through Hyper-V to VMware as the most expensive).
VMware has excellent management tools and a track record in providing hardware virtualization, but it comes at a relatively hefty price, especially if you are putting a large deployment together.
If you are primarily a Windows environment and most of the guest machines are going to be running Windows Server, then a Hyper-V environment may be preferable. The licensing costs can be lower if deployed correctly with Windows Server Datacenter edition or Hyper-V Server, and the management interfaces will be familiar to users.
SEE: Microsoft’s latest Windows Server 2019 test build includes first preview of Hyper-V 2019 (ZDNet)
KVM and Xen are both excellent open-source hypervisor platforms, but they lack management interfaces. There are options to address this, such as going for an OpenStack environment or using a front end such as OnApp, but these add some complexity to the design if you don’t have prior experience with those tools or with open-source software in general.
Rittwage: I’m not sure I would deploy anything except for the majors for any critical business role, but for practice and learning about the product, or for temporary disaster recovery situations, I’ve seen VirtualBox used.
Koblentz: In what context should you opt not to virtualize a server?
Barker: Most workloads can be virtualized, but if you have applications with particularly heavy CPU/RAM usage or very heavy disk I/O, then it may be better to have them as standalone servers within a wider virtualized environment. You can also have the physical server deployed as a hypervisor, but with only a single virtual machine running on it, which can be a good way to ensure the required resources are available to that application while keeping the benefits of management and migration that a virtualized environment can bring.
SEE: Photos: Server room real-world nightmares (TechRepublic)
Likewise, legacy applications can be an issue to put into a virtual environment: not all applications will sit happily with virtual CPUs or virtual NICs, as they were designed to talk to the physical hardware itself. Given the maturity of the virtualization market, these applications are becoming far fewer and less of a concern as time goes on.
Rittwage: Generally, if you plan to use all the resources for one specific high-CPU or high-IOP function, such as a busy SQL server, there is little reason to virtualize that. Virtualization is about sharing the underlying hardware with other tasks.
Koblentz: Looking forward another five years, what do you think will be new challenges/concerns in virtualization that aren’t yet clear to most people?
Barker: Mostly I suspect this will be a shift toward more network virtualization on the physical network hardware, to support workloads and virtual machines that regularly migrate between hypervisor nodes. It will mean ensuring that the physical network infrastructure supporting your virtual infrastructure is properly designed for SDN, scripting, and VXLANs.
Another area will be the continued increase in the use of containerization within the virtual machines: products such as Docker and Kubernetes provide for OS and application virtualization within the virtual machine itself. In the right use cases, this brings massive benefits in speed of deployment, consistency of the environment, and the ability to migrate application workloads instantly between virtual machines.
Rittwage: It’s pretty mature at this point, so I’m not sure what new challenges will show up in the next 5 years.
Koblentz: Generally, what other advice do you have for people in charge of implementing and maintaining server virtualization projects?
Barker: Plan for growth. During the design phase, after you have your benchmarking of the existing environment, make sure to plan for how you’ll expand the platform with new hypervisors or additional storage in a way that minimizes impact on the environment. With virtualized environments, there is an expectation of much higher availability, and you need to be able to add in another set of disks or another four hypervisors without having to re-architect the whole platform because there were only enough switch ports for the initial build.
Also, make sure you still have a good backup strategy. Although everything is now virtualized and likely a lot more resilient to the failure of a physical component of the infrastructure, things do still go wrong. Having everything virtualized opens up other backup strategies, with snapshots of virtual machines and technologies such as [backup appliances], which can make taking backups, managing them, and restoring far easier than when everything ran on its own individual server.
Rittwage: Plan for performance, growth, and redundancy. People expect to be able to use an expensive server for five years or more. Use a consultant who has successfully moved many companies to virtualization.