One of the biggest drawbacks to Active Directory is its distributed nature. Whenever you make an update to Active Directory, your change is added to the Active Directory database on a domain controller. The domain controller must then replicate the change to all of the other domain controllers in the domain. This replication process results in more traffic on the network. The more domain controllers that are in the domain, the more replication-related traffic you can expect on your network. Fortunately, there are some things you can do to get a handle on replication-related traffic.
Under normal conditions, when you make an update to Active Directory, Windows updates the other domain controllers with the change almost immediately. Windows assumes that if the domain controllers are in the same domain, they are probably well connected and, therefore, there is little concern about bandwidth or replication speed. Replication occurs on an as-needed basis.
In some organizations, though, this unscheduled, as-needed replication is unacceptable because of bandwidth limitations. In such environments, it’s necessary to minimize replication-related traffic by implementing a replication schedule. Unfortunately, you can’t apply a replication schedule directly to a domain. You must divide the domain into sites and then schedule replication between the sites.
Designing a site structure
The first step to optimizing replication traffic is to develop an effective site structure. Microsoft recommends dividing your organization into sites in a way that mimics the subnetting scheme. For example, assuming you buy in to the Microsoft philosophy of site design, if you have five subnets, then you should have five sites as well.
Generally, I agree with this particular design philosophy. The only problem with it is that not everyone has his or her network subnetted. For example, when I managed the computer network for Fort Knox, KY, the network consisted of 25,000 computers spread out over hundreds of buildings and dozens of square miles. The problem was that the person who originally designed the network implemented it as a flat network, meaning that the entire network consisted of a single subnet. Even when I left the Army in 1999, the constantly growing network still had only a single subnet. According to the Microsoft design philosophy, this network should have used only a single site because there was a single subnet. Having a single site over such a large network would cause some performance problems. In case you’re wondering, the network did suffer from constant performance-related problems as a result of its design.
I give you this example to illustrate that linking site structure to subnet structure is a sound idea, but the performance gains associated with site segregation are available only if your network is effectively divided into subnets in the first place.
Subnetting your network
Since this is an article about optimizing replication traffic, I don’t want to waste a lot of time talking about subnetting, but I do want to briefly address it. There’s really no right or wrong way to subnet your network. When I build a network, I usually subnet based on geography and groups of users.
Subnetting prevents packets from one network segment from flowing into another network segment. For example, suppose for a moment that you had two subnets, A and B. If a user in subnet A sent a packet to another user in subnet A, then the router that joins the two subnets would prevent the traffic from flowing onto subnet B. This means that there is less traffic on subnet B than there would be if the two segments were not separated by subnets.
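The routing decision described above can be sketched in code. Here’s a minimal illustration using Python’s standard ipaddress module; the subnet ranges and host addresses are hypothetical examples, not taken from the article:

```python
# A minimal sketch of how a router decides whether a packet needs to
# leave its subnet. The two /24 networks below stand in for subnets A
# and B from the example; the addresses are hypothetical.
import ipaddress

subnet_a = ipaddress.ip_network("192.168.1.0/24")
subnet_b = ipaddress.ip_network("192.168.2.0/24")

def crosses_subnet_boundary(src: str, dst: str) -> bool:
    """Return True if the packet must be routed to a different subnet."""
    src_ip, dst_ip = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    src_net = subnet_a if src_ip in subnet_a else subnet_b
    return dst_ip not in src_net

# Traffic between two hosts in subnet A never reaches subnet B:
print(crosses_subnet_boundary("192.168.1.10", "192.168.1.20"))  # False
print(crosses_subnet_boundary("192.168.1.10", "192.168.2.20"))  # True
```

The router forwards only the traffic whose destination lies outside the source subnet; everything else stays local, which is exactly why subnet B sees less traffic.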
Since the whole purpose of subnetting is traffic control, let’s look at the areas of your network that need traffic control. As I said earlier, I usually begin subnetting a network based on geography. If your company has an office in Miami, another one in Las Vegas, and another in New Orleans, then it’s obvious that each office will need a separate network and therefore an individual subnet for that network. However, this same principle of geographic subnetting also applies on a smaller scale.
Subnetting makes sense not only where facilities are separated by a slow WAN link; it also makes sense within an individual building. For example, suppose you had a couple of thousand users working in a single building. You could implement subnets in a way that would reduce broadcast traffic across the network. One way of accomplishing this is to subnet by department. Suppose that the finance department had its own file and print servers. You could create a subnet specifically for the finance department. By doing so, none of the broadcast traffic generated within the finance department would be passed to the rest of the building unless it was specifically intended for another subnet.
Subnets and sites
Now that you know a little about the benefits of subnetting, let’s take a look at how Active Directory sites play into the equation. While subnets are a mechanism used to divide the network into smaller components, sites are an Active Directory-level mechanism used to split a domain into smaller pieces.
To understand the benefits of a site, let’s take a look at an example of a company that, back in the early ’90s, was forced to expand its operations into a warehouse across town. Obviously, Windows 2000 didn’t exist then and neither did sites. However, domains did exist.
At the time, the company was using Windows NT 3.51 and had a single domain that covered the main office and the satellite office. Windows NT uses a PDC/BDC domain model in which all domain controller updates are applied to the PDC and distributed out to the BDCs by the PDC. In Windows 2000, though, a multimaster domain model is used instead. This means that when a change is made to Active Directory, the change is applied to the closest domain controller and then replicated to all other domain controllers.
To realize the benefits of sites, let’s pretend that this company was running Windows 2000 instead of Windows NT, and that the organization was divided into two subnets. In an arrangement like this, if someone in the main office made an update to Active Directory, the domain controller that accepted the update would transmit the update to all other domain controllers. If the satellite office had five domain controllers, there would be five separate sets of replication traffic flowing across the slow WAN link so that those domain controllers could be updated.
Sure, in this particular model, the subnetting prevents unnecessary traffic from flowing across the WAN link, but remember that the replication traffic is specifically intended for targets in the other office. This means that all five sets of replication packets would flow across the WAN link. Of course, this example illustrates a single update. If 10 different Active Directory objects were updated, there would be 50 bursts of replication packets flowing across the WAN link (10 sets of packets for each of the five domain controllers at the satellite office).
This is where sites come into play. Sites do two main things for Windows. First, they allow you to schedule replication rather than having it occur on an as-needed basis. Second, they condense replication traffic into single sets.
To see why scheduling replication-related traffic is important, consider this: If someone at the satellite office were to make an update to Active Directory, the update would most likely be related to an Active Directory object that applies directly to that office, such as a password change for a user in that office.
If a user were to change his or her password, it would be necessary for the password change to eventually be replicated to every domain controller in the domain. However, it would be even more important for the change to be quickly replicated to the domain controllers that are most likely to authenticate the user’s request. Therefore, if a user changes his or her password, it’s much more important for a domain controller in the user’s own office to be updated with the new password than it is for some domain controller in another office to be updated, although all domain controllers need to be updated eventually.
Sites make this selective updating possible. For example, suppose you established a site replication schedule of 15 minutes. If a user in one site were to change his or her password, then the domain controllers in the user’s local site would be updated immediately. But it could be up to 15 minutes before domain controllers in remote sites were updated. During this time, the site’s bridgehead server would collect domain controller updates and transmit them all at once at the end of the replication cycle.
Earlier, I said that a benefit to using sites was that you could reduce the number of replication-related packets flowing across the network. The reason for this is the fact that each site has a designated bridgehead server. A bridgehead server is a server whose job it is to send and receive replication traffic to and from other sites.
If the remote office had five domain controllers, then five sets of replication traffic would flow to the remote office for every Active Directory update. This isn’t the case when sites are used, though. When sites are used, a single set of replication traffic flows between the two sites. Rather than trying to update every single domain controller, the local site simply sends the updates to the remote site’s bridgehead server. It’s then up to the bridgehead server to distribute the traffic to the domain controllers within the remote site. In this particular example, rather than having to send updates to five individual domain controllers, updates are sent to a single bridgehead server. That means an 80 percent reduction in replication-related traffic across the WAN link.
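The traffic savings described above are easy to verify with a little arithmetic. This back-of-the-envelope sketch uses the figures from the example (five domain controllers at the remote office, 10 updates):

```python
# Without sites, every update flows to each remote domain controller
# individually; with a site and a bridgehead server, each update crosses
# the WAN link exactly once. Figures match the article's example.
remote_dcs = 5
updates = 10

without_sites = updates * remote_dcs   # one set of packets per remote DC
with_sites = updates * 1               # one set to the bridgehead server
reduction = 1 - with_sites / without_sites

print(without_sites)       # 50 bursts of replication packets
print(with_sites)          # 10
print(f"{reduction:.0%}")  # 80%
```

In general, the savings work out to (N - 1) / N for a remote site with N domain controllers, which is why the five-DC office sees an 80 percent reduction.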
Setting site replication frequencies
I hope that I’ve convinced you of the benefits associated with implementing sites in your network. The actual techniques for creating sites are beyond the scope of this article, but I do want to take the time to discuss a couple of the parameters and concepts that you’ll have to configure when setting up sites.
The first concept that you need to understand is that of a site connector. A site connector is a logical object that defines how replication traffic flows between two sites. You must have at least one site connector joining any two sites, although it is possible to use redundant or transitive site connectors. For example, if you had three sites, A, B, and C, at a minimum you’d need two site connectors, AB and BC. However, you could have site connectors for AB, AC, and BC. You could also have multiple instances of each site connector if multiple physical data paths were available between the sites. An example of this is an environment that is connected by a broadband link but also has a dial-up link that’s used in emergency situations.
Although there are a lot of configuration options related to site connectors, the two most important are the cost and the replication frequency. The cost is simply a numeric value that Windows uses to determine which site connector to use. If a site had two site connectors to another site, and one had a cost of 1 and the other had a cost of 2, the connector with the lowest cost would always be used. The higher cost connection would be used only if the lower cost connection was unavailable.
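The cost-based selection described above amounts to picking the lowest-cost connector that’s still available. Here’s a minimal Python sketch of that logic; the connector names, the dictionary layout, and the availability flags are my own hypothetical illustration:

```python
# A sketch of lowest-cost site connector selection. The broadband and
# dial-up connectors mirror the emergency-link example from the text.
def pick_connector(connectors):
    """Choose the available site connector with the lowest cost."""
    available = [c for c in connectors if c["available"]]
    return min(available, key=lambda c: c["cost"]) if available else None

connectors = [
    {"name": "broadband", "cost": 1, "available": True},
    {"name": "dial-up",   "cost": 2, "available": True},
]

print(pick_connector(connectors)["name"])  # broadband

# If the broadband link goes down, the higher-cost dial-up link takes over:
connectors[0]["available"] = False
print(pick_connector(connectors)["name"])  # dial-up
```

As long as the cost-1 connector is up, the cost-2 connector carries no replication traffic at all, which is exactly the behavior described above.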
The replication frequency parameter controls the amount of time between replication updates. The minimum value is 15 minutes, and the maximum is 10,080 minutes, or one week.
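As a quick illustration of those limits, here’s a small sketch that clamps a requested interval to the allowed range. The function name and the clamping behavior are my own illustration; the configuration UI simply won’t accept values outside this range:

```python
# Replication interval limits cited in the article: a minimum of
# 15 minutes and a maximum of 10,080 minutes (one week).
MIN_INTERVAL, MAX_INTERVAL = 15, 10_080  # minutes

def validate_interval(minutes: int) -> int:
    """Clamp a requested replication interval to the allowed range."""
    return max(MIN_INTERVAL, min(MAX_INTERVAL, minutes))

print(validate_interval(5))       # 15 -- below the minimum, raised to it
print(validate_interval(30))      # 30
print(validate_interval(20_000))  # 10080 -- capped at one week
```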
There’s really no right or wrong way to set the replication frequency. The main point to remember is that a longer interval between replication cycles means less replication traffic but less frequent updates. A shorter interval means a more consistent Active Directory database, but it also means you will be sacrificing some bandwidth to get those frequent updates. My personal thoughts are that on smaller networks, 15-minute replication cycles are acceptable. On larger networks, I usually use 30-minute replication cycles. Of course, these are just guidelines. In the real world, you must look at a company’s operational needs and use those to determine the ideal replication frequency.