
Win2K high availability: NLB outweighs DNS round robin

Network load balancing and DNS round robin both distribute traffic to groups of similar servers. However, NLB delivers a number of important benefits that DNS round robin lacks. Take a look at the differences between the two approaches.


When you're looking at employing high availability for important services, you need to be clear about the similarities and differences between network load balancing (NLB) and DNS round robin (RR). Both are methods for distributing client network traffic to servers running the same service, but there are some important differences in how they work in Windows 2000. Microsoft’s NLB requires Windows 2000 Advanced Server and is a separate Win2K network component that needs to be configured, backed up, and maintained. Windows 2000 DNS supports RR by default, is easy to configure, and requires only Windows 2000 Server.

Several key features make NLB more sophisticated than DNS RR, including:
  • Monitoring availability
  • Accommodating session state
  • Port awareness
  • Load settings
  • No need to worry about client caching
  • No need to worry about subnet prioritization

Let's take a closer look at each of these issues.

Monitoring
A DNS server cannot verify the availability of hosts in its records. Thus, it might easily hand out the IP address of a server that is switched off or one that's up but has a crashed service.
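The behavior can be illustrated with a small simulation. This is only a toy model of round robin rotation, not Windows 2000 DNS code; the class name and the addresses are made up for the example.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS server rotating the A records for one host name."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        # Hand back the current ordering, then rotate so the next
        # query sees a different address first.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


zone = RoundRobinZone(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
down = {"10.0.0.2"}  # hypothetical: this server is switched off

# The zone has no idea which hosts are alive, so the dead server
# still comes first on every third query.
first_choices = [zone.resolve()[0] for _ in range(6)]
dead_hits = sum(1 for ip in first_choices if ip in down)
print(first_choices, dead_hits)
```

Nothing in the rotation consults the `down` set, which is exactly the limitation described above.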

Conversely, NLB constantly monitors the network availability of servers in its cluster, so it will not direct traffic to a server that is switched off. However, NLB will continue to direct traffic to an online server with a crashed service. Unlike Microsoft’s other clustering technology, the Cluster Service, NLB does not monitor at the resource level.
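To make the difference between host-level and service-level availability concrete, here is a minimal sketch of a service-level probe. It is not something NLB provides; it simply shows the kind of check you would have to run yourself, and the function name is hypothetical.

```python
import socket


def service_is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    A host can answer on the network while the service on a given
    port is down, so this tests the service rather than the machine.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat the service as down.
        return False


# Example usage (hypothetical address):
#   service_is_up("10.0.0.1", 80)
```

A successful connection only proves that something is listening on the port; a real health check would usually go further and request a known page or response.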

Session state
DNS servers don't keep track of the addresses of requesting clients, which means they can't tie a client to the same server when that is necessary. This is typically required for maintaining session state, such as when a user shops on a public Web site and saves items in a shopping cart (on the Web server the user is currently connected to). If the client reconnects with the same name but to another server, the shopping cart will appear to be empty.

Another example involves internal servers that users connect to via Terminal Services. If a user disconnects instead of logging off, applications continue to run in that session on the original server. If the user's next connection lands on a different server, a new session starts there and the disconnected one is left behind on the original server.

NLB can be configured with or without this kind of client affinity, so it can maintain session state when necessary. With NLB, you can actually specify two types of client affinity. The first type, Single Affinity, matches the whole client address to ensure that the same client will always be reconnected to the same server. This is the safest option to use when you need to maintain session state but the least efficient for load balancing.

The second type, the Class C option, is a compromise between no affinity and single affinity. It matches only the first three octets of the connecting address, so all clients whose addresses fall within the same Class C range are always directed to the same server. Clients from different Class C ranges can still be distributed across different servers.
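The three affinity settings differ only in how much of the client's address takes part in choosing a server. The sketch below illustrates that idea; the hash is a stand-in chosen for the example and is not NLB's actual algorithm, and the function and mode names are hypothetical.

```python
import hashlib
import ipaddress


def affinity_key(client_ip, client_port, mode):
    """Return the part of the connection that decides the target server."""
    ip = ipaddress.IPv4Address(client_ip)
    if mode == "none":
        # No affinity: address and port both influence the choice.
        return f"{ip}:{client_port}"
    if mode == "single":
        # Single affinity: the whole client address, port ignored.
        return str(ip)
    if mode == "class_c":
        # Class C affinity: only the first three octets count.
        return str(ipaddress.IPv4Address(int(ip) & 0xFFFFFF00))
    raise ValueError(f"unknown affinity mode: {mode}")


def pick_server(servers, client_ip, client_port, mode):
    # Stand-in hash (an assumption for illustration only).
    key = affinity_key(client_ip, client_port, mode)
    digest = hashlib.sha256(key.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


servers = ["web1", "web2", "web3", "web4"]
# Same client, different source ports: single affinity keeps one server.
a = pick_server(servers, "192.168.1.10", 1025, "single")
b = pick_server(servers, "192.168.1.10", 4000, "single")
# Two clients in the same Class C range share a server under class_c.
c = pick_server(servers, "192.168.1.10", 1025, "class_c")
d = pick_server(servers, "192.168.1.200", 1025, "class_c")
```

The broader the key, the more evenly connections can spread; the narrower the key, the more clients are pinned together, which is why single affinity is the safest for session state but the least efficient for balancing.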

Port awareness
DNS servers work at the network level only. They are totally unaware of what services (ports) will be used. NLB allows for granular behavior at the transport level, with the ability to load-balance only on specific ports. You can specify TCP ports, UDP ports, or both. You can also specify either all port numbers or specific port ranges. For example, for Web traffic only, you would specify TCP ports 80 and 443 (SSL). Specific ports can also be blocked (disabled), thus acting as a simple filtering/firewall mechanism.
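One way to picture port rules is as a small table that each incoming packet is matched against. The following is a rough model only, assuming a simplified rule with a port range, a protocol, and a mode; the class, the field names, and the blocked FTP port are hypothetical and do not reflect how NLB stores its configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    start: int      # first port in the range
    end: int        # last port in the range (inclusive)
    protocol: str   # "tcp", "udp", or "both"
    mode: str       # "balance" or "disabled" (blocked)


def match_rule(rules, port, protocol):
    """Return the mode of the first rule covering this port and protocol."""
    for rule in rules:
        if rule.start <= port <= rule.end and rule.protocol in (protocol, "both"):
            return rule.mode
    # What happens to traffic that matches no rule is outside this sketch.
    return None


RULES = [
    PortRule(80, 80, "tcp", "balance"),    # HTTP
    PortRule(443, 443, "tcp", "balance"),  # SSL
    PortRule(21, 21, "tcp", "disabled"),   # hypothetical blocked port
]

print(match_rule(RULES, 443, "tcp"))  # balance
print(match_rule(RULES, 21, "tcp"))   # disabled
```

DNS round robin has no equivalent of this table, because a DNS answer is given before the client has chosen a port at all.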

Load
Not only can you define load balancing at the port level, but you can also define the load weight of traffic between servers. In other words, you can specify what proportion of the identified traffic should go to each server. The default is Equal, but it would be useful to set a different value if your servers have asymmetric hardware or if you’re load balancing different services. For example, you might have four clustered servers all load balancing Web traffic but only two load balancing FTP traffic.
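The effect of unequal load weights can be sketched as follows. The weights and server names are invented for the example, and the slot-mapping approach is an assumption made for illustration rather than a description of how NLB divides traffic internally.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def pick_weighted(weights, slot):
    """Map a slot number (0 <= slot < total weight) to a server name."""
    names = list(weights)
    bounds = list(accumulate(weights[name] for name in names))
    if not 0 <= slot < bounds[-1]:
        raise ValueError("slot out of range")
    # Each server owns a run of slots as long as its weight.
    return names[bisect_right(bounds, slot)]


# Hypothetical weights: two stronger servers, two weaker ones.
WEB = {"web1": 50, "web2": 50, "web3": 25, "web4": 25}
# Only two of the four servers take FTP traffic.
FTP = {"web1": 1, "web2": 1}

web_counts = Counter(pick_weighted(WEB, s) for s in range(150))
ftp_counts = Counter(pick_weighted(FTP, s) for s in range(2))
print(web_counts)
print(ftp_counts)
```

Over the 150 Web slots, the two stronger servers each receive twice as many as the weaker ones, while the FTP rule never touches web3 or web4.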

Client caching
It may come as a surprise to learn that client caching can foil your DNS round robin plans when your clients are running Windows 2000 or Windows XP. Both of these OSs include an automatic client DNS cache that can make it appear that round robin is not working. DNS servers honor a host record’s Time To Live (TTL) value to reduce network traffic between communicating servers, and Microsoft DNS clients honor the same value: for reconnections, they use their cached response if one exists and has not expired. This means that they may not ask the DNS server to resolve the host name again. As a result, the DNS server can't give out a different order of addresses.

You can view the contents of the DNS client cache with the command ipconfig /displaydns to determine whether this is the reason your client is reconnecting to the same server and not a different server when using DNS round robin. Similarly, you can use the ipconfig /flushdns command to empty the cache and force the client to request the address from a DNS server, but obviously this is practical only in a testing environment.

In production use, there are two ways around the client DNS cache. You can reduce the TTL value on the host record, if possible, at the expense of more server-to-server traffic. Or you can shrink the client DNS cache by setting the MaxCacheEntryTtlLimit registry value on each client to the minimum of 1, which effectively disables the cache. Editing each client's registry involves considerable administrative overhead and may not be possible, and the change affects all of that client's DNS requests, not just those for the load-balanced name.

You can find the MaxCacheEntryTtlLimit value under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters


For more information, see "How to Disable Client-Side DNS Caching in Windows" (Q245437).
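The interaction between the client cache and round robin can be simulated in a few lines. This is a toy model, not the Windows DNS client: the class and function names are hypothetical, time is passed in explicitly, and the max_ttl parameter only mimics the idea of capping cached TTLs in the way MaxCacheEntryTtlLimit does.

```python
from collections import deque


def make_round_robin_server(addresses, ttl):
    """Return a lookup function that rotates its answers on every query."""
    records = deque(addresses)

    def lookup(name):
        answer = list(records)
        records.rotate(-1)
        return answer, ttl

    return lookup


class CachingResolver:
    """Toy client resolver that reuses answers until their TTL expires."""

    def __init__(self, lookup, max_ttl=None):
        self._lookup = lookup
        self._max_ttl = max_ttl  # optional cap on cached TTLs, in seconds
        self._cache = {}

    def resolve(self, name, now):
        entry = self._cache.get(name)
        if entry and now < entry[1]:
            return entry[0]  # still fresh: the server is never asked
        addresses, ttl = self._lookup(name)
        if self._max_ttl is not None:
            ttl = min(ttl, self._max_ttl)
        self._cache[name] = (addresses, now + ttl)
        return addresses


ADDRS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

cached = CachingResolver(make_round_robin_server(ADDRS, ttl=3600))
capped = CachingResolver(make_round_robin_server(ADDRS, ttl=3600), max_ttl=1)

# First address the client would try at t = 0, 10, and 100 seconds.
cached_firsts = [cached.resolve("www.example.com", t)[0] for t in (0, 10, 100)]
capped_firsts = [capped.resolve("www.example.com", t)[0] for t in (0, 10, 100)]
print(cached_firsts)
print(capped_firsts)
```

With the one-hour TTL honored, the client keeps returning to the first server; with the cache capped at one second, every reconnection reaches the DNS server and the rotation becomes visible again.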

Subnet prioritization
Another setting that can play havoc with DNS round robin load balancing is subnet prioritization on either the server or client. This feature is supported on Windows 2000 DNS servers and on recent Windows clients, and like client caching, it is intended to reduce network traffic. Its effect is that the natural rotation of addresses you would normally see with round robin is skewed in favor of keeping traffic local.

For example, when subnet prioritization is used on a DNS server, an address that is on the same subnet as the requesting client is returned first, before the other addresses. Similarly, when subnet prioritization is used on a DNS client, the client will sort through the addresses returned by the DNS server and look to use one that is on the same subnet as itself, irrespective of the order in which they were returned from the DNS server.
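The reordering can be sketched like this. The function name is hypothetical, and the 24-bit prefix used to decide what counts as "the same subnet" is an assumption for the example; the point is only that the local address wins regardless of the rotation.

```python
import ipaddress


def prioritize_local(addresses, client_ip, prefix_len=24):
    """Move addresses on the client's subnet to the front of the list."""
    client_net = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    local = [a for a in addresses if ipaddress.ip_address(a) in client_net]
    remote = [a for a in addresses if ipaddress.ip_address(a) not in client_net]
    # Relative order within each group is preserved.
    return local + remote


client = "10.0.1.77"
# Two successive round robin answers for the same name.
first_answer = ["10.0.2.5", "10.0.1.5", "10.0.3.5"]
second_answer = ["10.0.3.5", "10.0.2.5", "10.0.1.5"]

print(prioritize_local(first_answer, client))
print(prioritize_local(second_answer, client))
```

Both answers end up with 10.0.1.5 first, so this client never sees the rotation that round robin was supposed to provide.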

You can disable this default behavior if it affects your systems. On the Windows 2000 DNS server, it is a server Advanced property: Enable Netmask Ordering. On the client, you’ll have to edit the registry and add a new value, PrioritizeRecordData=0, under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters


For more information, see "New Registry Value to Disable DNR Local Network Priority Sorting" (Q196500).

Summary
Until you begin to examine Microsoft’s NLB options, you may not appreciate just how sophisticated this service can be and how much it can compensate for some troublesome DNS round robin issues. NLB far outweighs DNS round robin in many respects, even though it does require Windows 2000 Advanced Server and some additional configuration.
