Considerations for virtualizing all Active Directory domain controllers

Virtualizing Active Directory domain controllers may not seem like a good idea, but IT pro Rick Vanover argues why you should embrace it.

One of the more contentious topics in Windows administration is whether to virtualize Active Directory domain controllers (ADDCs). I believe virtualizing ADDCs should be fully embraced, and that they can be 100% virtualized under the right conditions.

If all ADDCs are running as virtual machines, a few key safeguards need to be in place. Primarily, the virtual environment itself cannot be a single domain of failure (or region of failure). In VMware vSphere configurations, this means that no single cluster, set of ESXi hosts, or shared storage resource should contain all of the ADDCs.

There are a couple of ways to spread the ADDCs across multiple ESXi hosts so that the failure of one vSphere cluster does not take every ADDC with it. One option is to have one or more ESXi hosts provisioned to different storage resources and management realms (like a vCenter Server) that contain additional domain controllers. This may mean putting a few production ADDCs within a development vSphere cluster whose virtual machines run on different hosts, different storage, and a different management realm than the production virtual machine workloads. That realm can be a development vCenter Server or even an unmanaged ESXi host or Hyper-V host.
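
One way to sanity-check this kind of placement is to list each ADDC alongside the cluster, storage, and management realm it depends on, and flag anything that every ADDC has in common. The following is a minimal sketch in Python; the inventory, the names, and the `shared_failure_domains` helper are all hypothetical and would need to be fed from your own records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainController:
    name: str
    cluster: str      # vSphere cluster, or a label for a standalone host
    storage: str      # SAN/NAS resource, or local disk on a named host
    management: str   # vCenter Server instance, SCVMM, or "unmanaged"


def shared_failure_domains(dcs):
    """Return the attributes on which every DC has the same value."""
    shared = {}
    for attr in ("cluster", "storage", "management"):
        values = {getattr(dc, attr) for dc in dcs}
        if len(values) == 1:
            shared[attr] = values.pop()
    return shared


# Hypothetical inventory: two ADDCs in production, one in development.
inventory = [
    DomainController("dc01", "prod-cluster", "san-a", "vcenter-prod"),
    DomainController("dc02", "prod-cluster", "san-a", "vcenter-prod"),
    DomainController("dc03", "dev-cluster", "local-esx07", "vcenter-dev"),
]

# An empty result means no single cluster, storage resource, or
# management realm is common to all ADDCs.
print(shared_failure_domains(inventory))

# With only the two production ADDCs, all three attributes are shared.
print(shared_failure_domains(inventory[:2]))
```

An empty dictionary is the goal: anything it reports is a resource whose loss would take down every ADDC at once.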

The management software, such as vCenter Server or System Center Virtual Machine Manager, may need Active Directory to start the required services for management. If the management services can't start, the virtualized ADDCs may not be able to start either. You can see how the situation can be complicated if Active Directory is not available for the management stack.

Another trick you can employ is to have an ADDC use local storage on an ESXi or Hyper-V host. While local storage isn't ideal in terms of availability compared to a SAN or NAS resource, it can be a way to place an ADDC in a separate storage failure domain and still reach the goal of having all of the ADDCs virtualized.

In addition, splitting the FSMO roles across a number of virtualized ADDCs that sit in different domains of failure within the virtualized infrastructure will mitigate risk. With the roles spread out, a role seizure becomes more manageable if a single virtualized ADDC becomes unavailable, since only the roles it held are affected.
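
The same idea can be checked for the FSMO roles by grouping them by the failure domain of the ADDC that holds each one. This is a rough sketch only; the role holders and failure-domain labels below are hypothetical and would normally come from your own documentation.

```python
from collections import defaultdict

# The five FSMO roles in Active Directory.
FSMO_ROLES = (
    "Schema Master",
    "Domain Naming Master",
    "RID Master",
    "PDC Emulator",
    "Infrastructure Master",
)


def roles_by_failure_domain(role_holders, dc_failure_domain):
    """Group FSMO roles by the failure domain of the DC that holds them."""
    grouped = defaultdict(list)
    for role in FSMO_ROLES:
        holder = role_holders[role]
        grouped[dc_failure_domain[holder]].append(role)
    return dict(grouped)


# Hypothetical assignment of roles to ADDCs.
role_holders = {
    "Schema Master": "dc01",
    "Domain Naming Master": "dc01",
    "RID Master": "dc02",
    "PDC Emulator": "dc03",
    "Infrastructure Master": "dc02",
}

# Hypothetical mapping of each ADDC to the failure domain it lives in.
dc_failure_domain = {
    "dc01": "prod-cluster",
    "dc02": "dev-cluster",
    "dc03": "local-esx07",
}

grouped = roles_by_failure_domain(role_holders, dc_failure_domain)
for domain, roles in grouped.items():
    print(f"{domain}: {', '.join(roles)}")

# The largest group is the most roles that would need seizing
# if one failure domain were lost.
print(max(len(roles) for roles in grouped.values()))
```

If one failure domain ends up holding all five roles, the split is not buying much.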

These considerations are only a primer to the bigger decision of virtualizing ADDCs; the key takeaway is to ensure that virtualized infrastructure is capable of accommodating limited failure.

What design principles have you employed to virtualize your ADDCs? Share your tips with the TechRepublic community.

About

Rick Vanover is a software strategy specialist for Veeam Software, based in Columbus, Ohio. Rick has years of IT experience and focuses on virtualization, Windows-based server administration, and system hardware.

26 comments
OffshoreTechie

Sorry, I can't support this based on the arguments in the article. I have VM'd most boxes and use VMware on EVAs. Whilst this has saved space and power, and has simplified infrastructure management to an amazing degree, I have resisted all attempts by management and consultants to go 'all the way'.
What the article is saying is that to make sure you are covered, you need to put in an overcomplicated infrastructure, spread across sites (FSMO? Are you sure you want to do that?) and across multiple VM servers. Isn't that just negating the purpose of virtualising? If the worst happens whilst your AD guy is off, how are you going to explain that to a contractor? Don't knee-jerk to say "documentation": the more complex the solution, the larger the learning curve (especially when the MD is breathing down your neck; we've all been there :) ).
As for the option of Hyper-V *and* ESX, vendor diversity is not what is on offer here. Basically, this just means you are covered for patches etc., but to be honest, if we are looking at different hosts on different SANs, this is all at a different layer. Same with the security issues mentioned above. One would assume that this would be a concern on local disk, but with a SAN and a proper No Disk Return policy, this is covered as well.
I was amused by the local disk option though. I picture explaining to the finance director that the business suffered downtime due to a disk failure, and his response being along the lines of "Why did I spend $3m on a SAN then?", followed swiftly by "I'll get someone else to do your job".
The one-physical, n-virtual option is to me the closest match for doing this, but again, where do you place this risk? I have to agree with doug.montgomery@ "Servers are so inexpensive these days, I don't understand the advantage to virtualize this critical piece of the network". Even cheaper if you re-use a server that has just been virtualised!
In my mind, Microsoft has spent a lot of time and effort making AD as robust as possible, so why not take advantage of their investment and have a couple of servers just doing the job? Where I would focus is on backing up the role master and having a contingency to present it on a VM in times of crisis. This way your infrastructure is in the contractor's handbook, your recovery times are shorter, and you know it will work. As my department has been at the end of this particular stick, I would urge you not to be too creative outside the lab for something like DCs. (My pain involved both an old historic NT4 DC and a 2003 AD on the virtual servers. The vCenter server was also on the virtual box. It isn't like that now! What a load of hassle.)

josiah

I have had great reliability in virtualizing ALL my DCs for my clients. For clients with multiple hosts, creating rules to keep DCs on separate hosts at all times resolves the host failure issue. For customers that have more robust infrastructures (say offsite storage, a hot site, or an external data center), that would include DCs in all locations where possible; then if, say, you get a lightning strike in a facility, you still have the external data center or hot site to fail over to. Not to mention that if you house replicas of your VMs in the offsite storage, your only limit is how quickly you can copy 20GB or 40GB, depending on whether you're running 2k3 or 2k8 DCs. For single-host clients there just have to be good, regular .vmdk backups. I find better performance and continuity with virtualized DCs. Not to mention the ease of bringing up an ESX host in the event of a disaster. With good documentation and access to hardware you can have a secondary ESX host up in less than an hour. Holding onto a single physical domain controller is, in my opinion, a false sense of security. Ultimately, going ALL in with virtual technology and having a good DR plan in place gives you the best of both worlds.

websisc

We are implementing virtual ADs using Hyper-V with System Center Virtual Machine Manager and a SAN. All of our equipment is on HP ProLiants and we are embracing the technology! We currently have a 2-site setup with a locally installed physical FSMO DC at each site and 2 virtual ADs. We upgraded our entire infrastructure from 30 servers to 12.

pcastill

I provide services for a mid-size company. We virtualized our ADDC using XenServer. Other services also run on that box on separate virtual machines. We did this as part of a consolidation plan. We also keep a weekly backup of the virtual server. Since the consolidation resulted in 1 spare server, we implemented a contingency plan in which the spare server can bring AD back into production from the backup within a couple of hours.

terencesmith79

If you are hosting DHCP and DNS on a VM, then even in the best case, where you are using DHCP reservations and static DNS records for your ESXi hosts, a total server outage (reboots of all ESXi hosts, power outage, etc.) could be a major problem. Your ESXi hosts won't be able to obtain IP addresses, since the DHCP server won't be started yet. Another problem could be that your other guests boot up before your DHCP/DNS server(s). So if DHCP and DNS are hosted on a VM, allocate static IPs to your ESXi hosts and set your DHCP and DNS server VM(s) to start up before the other guests do.
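
The start-up ordering described above can be sketched as a simple priority sort. This is only an illustration in Python: the role tags, the priority values, and the VM names are hypothetical, and the actual ordering would be configured in the hypervisor's own start-up settings.

```python
# Hypothetical role tags; a lower number means the VM starts earlier.
# Anything not listed is treated as an ordinary guest and starts last.
START_PRIORITY = {
    "dns-dhcp": 0,
    "domain-controller": 0,
    "management": 1,
}
DEFAULT_PRIORITY = 2


def startup_order(vms):
    """Return VM names with infrastructure services first.

    vms is a list of (name, role) tuples. sorted() is stable, so VMs
    with the same priority keep their original relative order.
    """
    ranked = sorted(vms, key=lambda vm: START_PRIORITY.get(vm[1], DEFAULT_PRIORITY))
    return [name for name, _role in ranked]


# Hypothetical guests on one host.
guests = [
    ("app01", "application"),
    ("vcenter", "management"),
    ("dc01", "dns-dhcp"),
    ("sql01", "database"),
]

print(startup_order(guests))
```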

joshi_at

I've virtualized all of my servers (except my SCCM installation, for other reasons) and I'm happy. Having another domain controller reachable over my WAN link makes me somewhat immune to DC failures onsite (configured and tested!). The only thing I have in spare is a DHCP server, which isn't enabled.

david.thomas

I know lots of people like to virtualise their ADDCs, but the key issues for me are:
- Security: you've just made your directory easily portable. Someone can take a copy of your DC virtual disk and brute force / hack it at leisure at a remote location.
- Lifecycle: because your ADDC is now virtual, it usually means there are snapshots of your disks lying around. The last thing you want is an old version of a DC coming back online and someone adding that as a new DC. If it is older than the tombstone period (typically 180 days), your AD environment will be flooded with zombie objects that can't be easily deleted.
That is enough for me to avoid the virtual environment completely. If you do decide to go down the virtual path, ensure you have some physical DCs to survive a VM host failure, and also ensure your AD recovery process is on physical machines; if you suffer a total loss you'll need to recover AD on physical machines.
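
The age check in the lifecycle point can be sketched in a few lines of Python. The 180-day default is the figure quoted above and the dates are hypothetical; check your own forest's tombstone lifetime. This covers only the age comparison, and a snapshot younger than the tombstone period is not thereby safe to bring back online.

```python
from datetime import datetime, timedelta


def exceeds_tombstone_lifetime(snapshot_time, now, tombstone_lifetime_days=180):
    """Return True if the snapshot is older than the tombstone period."""
    return now - snapshot_time > timedelta(days=tombstone_lifetime_days)


# Hypothetical dates for illustration.
now = datetime(2011, 8, 1)
old_snapshot = datetime(2011, 1, 1)     # 212 days before `now`
recent_snapshot = datetime(2011, 6, 1)  # 61 days before `now`

print(exceeds_tombstone_lifetime(old_snapshot, now))
print(exceeds_tombstone_lifetime(recent_snapshot, now))
```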

Daniel Breslauer

Say you have two ADDCs: how about having one virtualized in VMware and the other in Hyper-V (on different hosts, obviously)? For their storage, obviously use something proper as well and keep plenty of redundancy (I'd even consider spreading it over three or more; of course it all depends on the size of the company and the budget).

b4real

Thanks for sharing your configuration.

RTHJr

With DCs, if there is only one DC left in the domain and it is virtualized, I find I must use a local logon account to be able to authenticate on the VM console to get to my DC! Otherwise, if something goes wrong, I cannot solve a problem with the DC because I am unable to log on to the hypervisor console.

nonimportantname

Why anyone would allow DHCP to assign IP addresses to their VM HOSTS is beyond me.

b4real

Good point. But, why would anyone hold the security of a domain controller to any different level than any other workload? By that logic, virtualization would be forbidden the way I'd see it.

Wonko79

It's not only the zombies; you will also have a condition called USN rollback... To me security is the big point, but that depends on the size of the company and who is managing the servers. In our organization the virtual hosts are managed by other teams, and we would never allow a virtual domain controller (except read-only) because we cannot control who gets access to the VHDs. It is far too easy to compromise the entire domain or even forest by doing this, and considering that a domain controller runs fine on physical hardware that costs $2000-$3000 for about 5 years of lifetime, we think it's worth the money.

nonimportantname

We haven't pinned down the actual cause of the issue yet, but we noticed that our virtualized domain controllers were receiving (and thus disseminating) the wrong time. The discrepancy would increase over time and had the potential to muck everything up. We've tried different source NTP servers on the net and still got the same issues. As we all know, time in Windows is utterly important and something that is really taken for granted, and due to the nature of the product we provide, the discrepancy was unacceptable. We THINK that there is some issue between the VM host and the virtualized DC and have decided to attack that angle by building new, physical DCs. The corridor between the virtual environment and the physical one is just another layer that can fail for such a critical service, and in our mind there's little potential reward for the high risk.

monte.sadler

We keep one DC on the physical side and virtualized the other 5. Just remember your physical DC needs to be either your primary or secondary DNS in case of a major failure in an HA and DRS VM environment.

ssabanis

Virtualizing all ADDCs may not be such a good idea. Keeping one DC on a physical machine may protect your domain when a serious disaster occurs in your virtual environment. I run 3 different Windows domains and I follow the same pattern on each domain: keep one ADDC on a physical machine and the rest virtual.

b4real

I do it in lab configurations. A very strategic setup with reservations is an option, even in production, though I've never done it.

doug.montgomery

It could also be said that assigning any static addresses besides your DNS servers is illogical.

david.thomas

For production environments, we hold ADDCs at the highest security level for the Wintel and Lintel fleet. This is because all UNIX and Windows hosts use AD as their authentication/authorisation platform and therefore AD is the most critical item in the security profile. Hence, production ADDCs are physical, physically secured, and run at the highest security level we can provide. Oh.. and virtualising a DC is out of the question due to security issues.

b4real

I disable time sync with host, have hosts sync to NTP, have domain sync to same NTP.

samuel.thomas

I had this same issue a year or so ago with a virtual DC. I solved it by unchecking the box in VMware Tools that says to sync the time with the host and letting the Windows NTP client do the work instead. I didn't have any more issues after that.

pgit

You've run ntp in the virtual environment and still had this problem?

doug.montgomery

I see your need for a physical, but aren't you back to a single point of failure? Can you start the virtual server without a domain controller/DNS available? Servers are so inexpensive these days, I don't understand the advantage to virtualize this critical piece of the network.

coldbrew

We had this issue happen to us about a year and a half ago. We had a power surge or lightning strike. This shut down the SAN, and the host, which booted from the SAN, could not locate it to come back up. We were all new to VMware/vSphere so we didn't realize this. We then stood up a DC on a physical server. Nothing major, just something to have out there in case the vCenter was down. Hard lesson learned.

b4real

Really depends on how "on-board" we are with centralized management.

b4real

Rounds out the stack from one approach.