SolutionBase: Troubleshooting Active Directory performance problems

Active Directory lies at the heart of your Windows-based network. As such, you want to make sure that it's running at peak efficiency. In this article, Brien Posey shows you how to troubleshoot common Active Directory performance problems.

Every article that I can remember reading (or writing, for that matter) about troubleshooting the Active Directory has dealt with troubleshooting functionality. Those articles assume that Active Directory is either working or it isn't. While information on troubleshooting Active Directory functionality issues certainly has its place, I have noticed an absence of material on troubleshooting Active Directory performance problems.

It's important for the Active Directory to not only function, but also to perform efficiently. Virtually every aspect of Windows Networking involves the use of the Active Directory in some way. If the Active Directory is having trouble keeping pace with the demands that the network is placing on it, then there can be some very noticeable consequences. For example, users might experience extremely slow logins. Likewise, users may have trouble displaying the Exchange Global Address List. In this article series, I will show you some techniques that you can use to determine the nature of your Active Directory's performance problem and what you can do to correct it.

Initial diagnostic testing

Even if everything appears to be working correctly, I recommend taking some time to verify that your Active Directory is indeed completely functional before you start studying Performance Monitor counters. There are several ways that you can accomplish this, but I recommend that you start out by running DCDIAG on each of your domain controllers.

As the name implies, DCDIAG is a domain controller diagnostic utility that is included with Windows Server 2003 (it is part of the Windows Support Tools on the installation CD). To use DCDIAG, simply open a Command Prompt window and enter DCDIAG. DCDIAG will then run a series of tests to make sure that the domain controller is functional. Keep in mind that DCDIAG does nothing to check whether the domain controller is running efficiently; it just runs some quick diagnostics to make sure that the domain controller has at least a minimal level of functionality.

Running DCDIAG is a really quick process. The entire series of tests takes only a few seconds. The testing results look something like what you see in Figure A. If DCDIAG does produce any errors, though, you should fix the cause of those errors before you attempt to track down any inefficiencies.

Figure A

This is what it looks like when you run DCDIAG.
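
If you have several domain controllers to check, scanning the DCDIAG output by eye gets tedious. The following is a minimal Python sketch of how the results could be filtered for failures. It is an illustration rather than part of DCDIAG itself, and it makes a few assumptions: Python 3.7 or later is available on the machine where you run it, DCDIAG reports each result on a line of the form "<server> passed test <name>" or "<server> failed test <name>", and the server name DC1 in the sample text is hypothetical.

```python
import re
import subprocess

# Matches DCDIAG result lines such as:
#   ......................... DC1 passed test Connectivity
RESULT_LINE = re.compile(r"\.+\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)")


def failed_tests(dcdiag_output):
    """Return (target, test_name) pairs for every test reported as failed."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_LINE.search(line)
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


def run_dcdiag():
    """Run DCDIAG on the local domain controller and return its text output."""
    completed = subprocess.run(["dcdiag"], capture_output=True, text=True)
    return completed.stdout


# Illustrative sample only; DC1 is a hypothetical server name.
SAMPLE = """\
   Testing server: Default-First-Site-Name\\DC1
      Starting test: Connectivity
         ......................... DC1 passed test Connectivity
      Starting test: Replications
         ......................... DC1 failed test Replications
"""

if __name__ == "__main__":
    # Swap SAMPLE for run_dcdiag() when running on a real domain controller.
    for target, test in failed_tests(SAMPLE):
        print(f"{target}: {test} FAILED")
```

An empty result means that DCDIAG reported no failed tests. It says nothing about how efficiently the domain controller is running.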

After running DCDIAG, I recommend that you run some quick replication tests. DCDIAG does perform a replication test of its own, but as I mentioned before, the level of testing that DCDIAG performs is minimal. DCDIAG's job is to test basic functionality, not to perform an exhaustive series of tests against every domain controller in your forest.

You might be wondering why I recommend performing a replication test when we have already used DCDIAG to confirm that the domain controller is working properly. The reason is that many performance problems are actually replication based. I therefore think that it makes sense to take a quick look to make sure that replication is completely functional before digging into other potential causes of performance problems.

There are several different tools that you can use to diagnose replication-related problems, but probably the easiest one to use is the Active Directory Replication Monitor, which is also part of the Windows Support Tools. To launch the Active Directory Replication Monitor, simply enter the REPLMON command at the Run prompt. Doing so will cause Windows to open an empty Active Directory Replication Monitor console.

To use the Active Directory Replication Monitor, you must begin by selecting a server to monitor. To do so, select the Add Monitored Server command from the console's Edit menu. This will cause the Active Directory Replication Monitor to launch the Add Monitored Server Wizard.

The wizard's first screen gives you a choice of either adding a domain controller specifically by name or of searching the Active Directory for domain controllers to add. I'm assuming that you know the names of your domain controllers, so select the Add the Server Explicitly By Name option and click Next.

At this point, you will be asked to enter the name of the server that you want to monitor. This screen also has a check box that you can select if you should need to provide the Active Directory Replication Monitor with an alternate set of credentials.

After you enter the name of the server that you want to monitor, you will see a screen that looks something like the one shown in Figure B. If you look at the figure, you will notice that although you are looking at a single domain controller, it is displayed hierarchically beneath the site in which it is located. You will also notice that there are three separate containers listed beneath the domain controller. These three containers correspond to the three default partitions that make up the Active Directory: the domain partition, the configuration partition, and the schema partition.

Figure B

The domain controller is listed beneath the site that it is a member of.

In almost every environment, the Active Directory contains more than one domain controller. Although an Active Directory performance problem may ultimately be linked to one individual domain controller, you must initially examine the Active Directory as a whole in order to get a clear picture of its health. That being the case, I recommend using the Add Monitored Server command on the console's Edit menu to add the rest of your domain controllers to the console.

When you add the other domain controllers to the console, the console will look something like what you see in Figure C. I only have two domain controllers on my test network at the moment, but I wanted to show you what it looks like to have multiple domain controllers listed in the console anyway.

Figure C

This is what the Replication Monitor looks like when it is set to display multiple domain controllers.

I am showing you this figure to point out that domain controllers will not always be displayed in the same manner. You will notice that the domain controller FUBAR has two additional partitions listed beneath it. These extra partitions are displayed because the server is acting as both a domain controller and a DNS server with Active Directory integrated DNS zones. It's the Active Directory integrated zones that cause the extra partitions to be displayed. Sometimes, an application might require a special application partition to be created in the Active Directory. Although I have never seen it personally, I have heard that the Active Directory Replication Monitor will display those partitions too.

The point is that as you examine your domain controllers, there should always be at least three partitions, but there might be more. The Replication Monitor can actually perform a lot of replication-related diagnostic tasks, but for the purposes of this article, we just need to make sure that replication is functional. The easiest way to accomplish this is to expand each partition listed within the Replication Monitor and select the link that corresponds to the partition's replication partner, as shown in Figure D.

Figure D

This is what the Replication Monitor looks like when replication is functioning properly.

As you can see in the figure, you should receive information telling you the USN (Update Sequence Number) through which the server is current. You should also see a message telling you when the most recent replication attempt occurred and whether or not that attempt was successful. Normally, if replication is functioning correctly between domain controllers in the same site, there should be a successful replication event within the last few minutes. Replication across site links follows the site link schedule, so longer gaps are normal there.

In case you are wondering what a replication failure looks like, you can see an example in Figure E. In the figure, you will notice that there is a red X over the replication partner's icon. The pane to the right also contains information regarding the nature of the failure. In this particular case, the failure occurred because I purposely took a domain controller offline.

Figure E

This is what a replication failure looks like.
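
The same pass or fail information can also be gathered from the command line, which is handy when there are too many domain controllers to click through. The sketch below is an assumption-laden illustration, not a replacement for the Replication Monitor. It assumes that you have saved the output of the REPADMIN command (another Support Tools utility, not covered in this article) using its /showrepl and /csv switches, and that the CSV uses the column names shown in the sample. Verify those column names against your own output before relying on them. The server name DC2, the example.com domain, and the timestamps are all hypothetical.

```python
import csv
import io


def replication_failures(showrepl_csv):
    """Return one record per replication link that reports failures.

    Assumes the CSV has the columns 'Number of Failures', 'Source DSA',
    'Destination DSA', 'Naming Context', and 'Last Failure Status'.
    """
    failures = []
    for row in csv.DictReader(io.StringIO(showrepl_csv)):
        count = (row.get("Number of Failures") or "").strip()
        if count.isdigit() and int(count) > 0:
            failures.append({
                "source": row.get("Source DSA"),
                "destination": row.get("Destination DSA"),
                "naming_context": row.get("Naming Context"),
                "failures": int(count),
                "status": row.get("Last Failure Status"),
            })
    return failures


# Illustrative data only; names, dates, and values are hypothetical.
SAMPLE = '''\
showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,Source DSA Site,Source DSA,Transport Type,Number of Failures,Last Failure Time,Last Success Time,Last Failure Status
showrepl_INFO,Default-First-Site-Name,FUBAR,"DC=example,DC=com",Default-First-Site-Name,DC2,RPC,0,0,2007-01-15 10:02:11,0
showrepl_INFO,Default-First-Site-Name,FUBAR,"CN=Configuration,DC=example,DC=com",Default-First-Site-Name,DC2,RPC,3,2007-01-15 10:17:42,2007-01-15 09:47:40,1722
'''

if __name__ == "__main__":
    for link in replication_failures(SAMPLE):
        print(f"{link['source']} -> {link['destination']} "
              f"[{link['naming_context']}]: {link['failures']} failure(s), "
              f"status {link['status']}")
```

Each record corresponds to one partition and replication partner pair, which is the same unit that the Replication Monitor marks with a red X.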

If a replication problem is occurring

Hopefully, this test has proven that replication is functioning properly. If you do detect a replication problem, though, it is important that you correct it before you even think about trying to sort out your Active Directory performance problems. There is a very good chance that the performance problems that you are having are directly related to the replication problems that you have detected.

I don't want to turn this into an article on repairing replication problems, but I do want to at least give you a hint as to where to start looking for replication problems. The most common causes of replication problems involve the server being down, a communications failure, or insufficient hard disk space on the server. Therefore, I recommend making sure that the server in question is up and that all of the critical services are running (pay special attention to RPC related services).

Likewise, you should verify the server's network connectivity. You need to make sure that the server can communicate with the other domain controllers on your network and that DNS resolution is working correctly. The malfunctioning server needs to be able to resolve the names of the functioning servers and vice versa. Of course, these are just the most common causes of replication failures. If you are having replication problems and the cause isn't obvious, then I recommend checking the server's Event logs and then cross-referencing any Event IDs found in the logs with the articles on Microsoft's TechNet.
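
The name resolution and connectivity checks described above can be scripted. The following Python sketch is one rough way to do it, under stated assumptions: the host names are hypothetical, TCP port 135 (the RPC endpoint mapper) is used as a stand-in for "RPC is reachable," and a successful connection to that port does not prove that replication itself will work.

```python
import socket

RPC_ENDPOINT_MAPPER_PORT = 135  # TCP port used by the RPC endpoint mapper


def resolve(hostname, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses for hostname, or [] if lookup fails."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr).
    return sorted({entry[4][0] for entry in results})


def port_open(address, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_domain_controllers(hostnames, resolver=socket.getaddrinfo,
                             probe=port_open):
    """Resolve each domain controller name and probe its RPC endpoint mapper."""
    report = {}
    for name in hostnames:
        addresses = resolve(name, resolver)
        reachable = any(probe(address) for address in addresses)
        report[name] = {"addresses": addresses, "rpc_reachable": reachable}
    return report


if __name__ == "__main__":
    # Hypothetical names; replace them with your own domain controllers.
    for name, result in check_domain_controllers(
            ["fubar.example.com", "dc2.example.com"]).items():
        print(name, result)
```

Because resolution has to work in both directions, a script like this would need to be run from the malfunctioning server and again from at least one healthy domain controller.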

DNS server performance

The last topic that I want to talk about in this article is your DNS server's performance. As I'm sure you probably know, the Active Directory is completely dependent on the DNS services. Without DNS there is no Active Directory. This is an important consideration when you are trying to diagnose Active Directory performance problems because if DNS response is slow, then Active Directory related functions may also be slow as a result.

The easiest way to tell whether or not your DNS server is keeping pace with the demands being placed on it is to use the Performance Monitor to analyze some DNS-related counters. Windows Server 2003 does not contain DNS-related counters by default, but if the server is acting as a DNS server, then the DNS counters are installed at the time that DNS is set up.

In an upcoming article, we are going to be using the Performance Monitor extensively, and there are a lot of best practices that you will probably want to use in order to make sure that your results are as accurate as possible. The best practices that I am talking about will be a lot more important later than they are now, so for now, just run the Performance Monitor locally on your DNS server to analyze the counter values.

The first two counters that you should examine are the Total Query Received/Sec and the Total Response Sent/Sec counters. These counters measure the number of DNS queries received and the total number of responses sent back to the clients each second. The values of these two counters should roughly mimic each other. Ideally, you would want the counters to read exactly the same at all times, but that isn't realistic due to the latencies involved in external queries. It's OK if the responses are slightly delayed, but you want to make sure that the DNS server isn't being so overrun with queries that it is taking forever to get responses back. Figure F shows a healthy (although not overly active) DNS server.

Figure F

DNS queries and responses should not be dramatically out of balance.
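
If you would rather compare the two counters numerically than judge the graph by eye, the counter values can be logged to a file and summarized. The sketch below assumes that the samples were exported as CSV (for example with the TYPEPERF command-line tool, which is not covered in this article) and that the column headers end with the counter names. The 0.9 threshold is an arbitrary illustration, not a Microsoft recommendation, and the sample values are made up.

```python
import csv
import io

QUERY_COUNTER = "Total Query Received/sec"
RESPONSE_COUNTER = "Total Response Sent/sec"


def load_samples(counter_csv):
    """Extract (queries_per_sec, responses_per_sec) pairs from CSV text."""
    rows = list(csv.reader(io.StringIO(counter_csv)))
    header = rows[0]
    q_col = next(i for i, name in enumerate(header)
                 if name.endswith(QUERY_COUNTER))
    r_col = next(i for i, name in enumerate(header)
                 if name.endswith(RESPONSE_COUNTER))
    samples = []
    for row in rows[1:]:
        try:
            samples.append((float(row[q_col]), float(row[r_col])))
        except (ValueError, IndexError):
            continue  # skip blank or incomplete samples
    return samples


def response_ratio(samples):
    """Return total responses divided by total queries, or None if idle."""
    total_queries = sum(q for q, _ in samples)
    total_responses = sum(r for _, r in samples)
    return total_responses / total_queries if total_queries else None


def is_keeping_pace(samples, minimum_ratio=0.9):
    """Treat the server as healthy if responses roughly track queries."""
    ratio = response_ratio(samples)
    return ratio is None or ratio >= minimum_ratio


# Illustrative data only; the values are hypothetical.
SAMPLE = r'''"(PDH-CSV 4.0)","\\FUBAR\DNS\Total Query Received/sec","\\FUBAR\DNS\Total Response Sent/sec"
"01/15/2007 10:00:01.000","12.0","12.0"
"01/15/2007 10:00:02.000","20.0","18.0"
"01/15/2007 10:00:03.000"," "," "
'''

if __name__ == "__main__":
    samples = load_samples(SAMPLE)
    print("ratio:", response_ratio(samples))
    print("keeping pace:", is_keeping_pace(samples))
```

Summing over the whole sampling window, rather than comparing sample by sample, tolerates the slight delays in responses that the external query latency makes unavoidable.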

If you do determine that your DNS server isn't able to keep up with the queries that it is receiving, then you are going to have to make a decision as to whether you want to set up some subordinate DNS servers or try to squeeze more performance out of your existing DNS server.

If you only have one DNS server, then it is never a bad idea to put in a second DNS server in case the primary DNS server were to fail. Fault tolerance issues aside, though, if it comes down to choosing between adding subordinate DNS servers and increasing the performance of your existing DNS server, the correct decision is going to vary depending on your network's design.

For example, if your network contains two or more sites (which usually reflects the presence of one or more WAN links), then it is often a good idea for each site to have its own DNS server. Doing so helps the users in each site to receive faster responses, and it keeps DNS queries from clogging up the WAN link. Furthermore, if each site has its own DNS server, then users in a remote site should be able to retain at least some level of functionality even if the WAN link goes down.

On the other hand, if your organization is relatively small and there are no WAN links involved, then it might be better to just upgrade your existing DNS server. There's nothing wrong with adding another DNS server, but smaller companies typically lack the budget to add servers on a whim. If you do decide that upgrading your DNS server is the best way to restore its performance, then you will have to do some additional performance monitoring to determine where the server's bottlenecks are.