Understanding server load balancing

Keeping servers stable and reliable is one of a network administrator's most important jobs. Load balancing is one method of achieving a higher degree of efficiency, and Deb Shinder helps you understand the concept.

Load balancing is a favorite buzzword (or in this case, buzz phrase) among IT professionals at the enterprise level and a favorite “feature” for selling new technologies. Nonetheless, many network administrators don’t really understand what it is and how it works. In this Daily Drill Down, I will provide an overview of how load balancing can increase the efficiency of your network servers and discuss some of the options available for implementing load balancing on your network.

What is load balancing?
The concept of load balancing is a simple one: spreading the work that a computer needs to do across multiple machines. However, the implementation of this idea can be quite complex. A number of vendors offer load balancing solutions that are implemented in different ways. You may have heard about Windows Load Balancing Service (WLBS), Network Load Balancing (NLB), Component Load Balancing (CLB), and other similar terms. In this Daily Drill Down, I will address the broad term server load balancing, which can encompass all of the above and more.

Load balancing and server clustering
One way to distribute the workload is to use server clustering. A server cluster consists of two or more servers that operate and are managed as if they were a single entity. The servers must be able to access one another’s disk data. Special software (such as MSCS, Microsoft Cluster Server) is used to manage the systems, automatically detect the failure of one system, and provide failover/recovery.

Server clusters are sometimes called server farms. In some implementations, the servers have individual operating systems, while in others they share an operating system. Large Web sites such as Yahoo use multiple servers in a Web farm to handle the huge volume of traffic.

Hardware vs. software implementations
Operating systems such as Windows NT/2000 and Red Hat's High Availability Linux Server provide software-based load balancing, and there are also third-party software packages such as those from Resonate. Many vendors also make hardware devices based on switching technology that include load balancing functionality.

Load balancing switches and routers, such as those made by Cisco, Radware, Foundry, Alteon, and other vendors, use a variety of algorithms to distribute TCP/IP requests among a group of servers. Load balancing switches are also often referred to as content switches and content directors.

Early load balancing solutions used a DNS round-robin algorithm; more recent methods include least connections and fastest response algorithms.
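To make the difference between these algorithms concrete, here is a minimal Python sketch (not any vendor's actual implementation) contrasting a round-robin selector, which simply cycles through the servers in order, with a least-connections selector, which always picks the server currently handling the fewest active connections:

```python
import itertools

class RoundRobinBalancer:
    """Cycle through the servers in fixed order, one request each."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a connection to `server` closes."""
        self.active[server] -= 1
```

Round robin is simple but blind: a server bogged down with long-lived connections keeps receiving new ones. Least connections adapts to the actual load, at the cost of tracking connection state.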

Advantages of server load balancing and clustering
Load balancing and clustering are part of a high-availability (HA) strategy. Having two or more computers handle the workload increases performance (speeding up the process), and the redundancy also provides fault tolerance; if one of the machines goes down, the others can continue to function. Clients see the group of servers as a single virtual server, with one IP address.

There are three big advantages of clustering servers to provide load balancing:
  • Easier and more flexible management: With clustering software, administrators can move the workload onto particular servers within a cluster (for example, to update a server without impacting accessibility of data and services to clients).
  • Uninterrupted availability and fault tolerance: If a server fails, clustering software detects the failure and fails over to a remaining server.
  • Better scalability: Load balancing can be scaled across multiple servers in a cluster. Applications that are written to run on server clusters can perform dynamic load balancing.

Load balancing and server clustering technologies are important to enterprise-level networks because of the mission-critical nature of servers such as those that provide a Web presence (and often, secure transaction services and database access) on the Internet or those that provide applications and data on the corporate intranet. Load balancing ensures high availability and little or no downtime for Web, proxy, terminal, and VPN servers.

Load balancing servers in a cluster allows companies to scale their network services in step with rapid growth: additional servers can be added to the cluster as network traffic increases. Load balancing is usually implemented in conjunction with server clustering. A load balancing cluster distributes the load of incoming TCP/IP traffic, while a server cluster provides fault tolerance.

How does server clustering work?
The servers that are members of a cluster are called hosts or nodes (depending on the vendor of the clustering technology). The cluster members are physically connected via network cables and programmatically connected via the clustering software.

The clustering software provides:
  • A means by which the cluster members can have common access to disk data.
  • A means of detecting when a server or application fails.
  • A means of recovering from a failure by shifting the work to the remaining server(s) or restarting the application.
  • An interface through which the servers in the cluster can be managed as one entity, presenting a “single system image.”

It is also useful if the cluster administration software allows you to remotely manage the server cluster.

Sharing disk data access between servers
There are several different methods that can be used to allow more than one server to have access to disk data. These include:
  • Shared disk method
  • Mirrored disk method
  • “Shared nothing” method

The shared disk method was used with the first implementations of server clustering. Software called Distributed Lock Manager (DLM) was used to give all servers in the cluster access to all physical disks. Shared disk clustering requires SCSI disks (or special cabling and switches) and applications that are modified to be aware of the disk sharing. Oracle’s Parallel Server uses shared disks.

With the mirrored disk method, each of the servers in the cluster has its own disks. The data is mirrored (an exact copy is written) to the disks on other servers. This requires special software such as that made by Veritas, NSI, and Octopus.

“Shared nothing” is a clustering method in which each server has its own disk resources. The clustering software transfers the ownership of a disk from one server to another if the server that owns the disk fails. An advantage of this method is that applications do not have to be modified. Microsoft Cluster Services (MSCS) uses this method.

Detecting server or application failure
When server clusters are used to provide fault tolerance, there must be a way for the cluster to detect when one of its member servers fails (or when an application on the cluster fails). One way, used by Microsoft in their clustering solutions, is with software “heartbeats”—messages that are sent on a regular basis between nodes. If a server fails, it will cease to emit the periodic heartbeat message, and the software will redistribute the workload among the remaining servers.
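The heartbeat mechanism can be illustrated with a short Python sketch. This is a simplified model, not Microsoft's implementation; the interval and failure-threshold values are assumptions chosen for illustration:

```python
import time

HEARTBEAT_INTERVAL = 1.0   # seconds between heartbeat messages (assumed value)
FAILURE_THRESHOLD = 5.0    # silence longer than this marks a node as failed (assumed value)

class HeartbeatMonitor:
    """Track the last heartbeat seen from each node and flag silent ones."""
    def __init__(self, nodes, now=time.monotonic):
        self._now = now  # injectable clock, handy for testing
        self.last_seen = {n: now() for n in nodes}

    def heartbeat(self, node):
        """Record that a heartbeat message arrived from `node`."""
        self.last_seen[node] = self._now()

    def failed_nodes(self):
        """Return nodes that have been silent longer than the threshold."""
        cutoff = self._now() - FAILURE_THRESHOLD
        return [n for n, t in self.last_seen.items() if t < cutoff]
```

In a real cluster, each node both emits heartbeats and monitors its peers; when `failed_nodes()` becomes non-empty, the clustering software would trigger failover of the dead node's workload.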

Understanding failover and failback
When a failure is detected, recovery involves failover, which is the transition that occurs when a server fails and another server(s) picks up its load. In many cases, this transition is transparent to the clients, as the applications, file shares, and other resources are restarted by the clustering software at the same IP address. If the client is browsing the Web or using some other “stateless” connection type, the user may not be aware of the failure at all. The application automatically reconnects after a failover. With some client applications, the user may receive a message that the server is unavailable and may be required to log back on. When the failed server comes back online, clustering software detects its presence and allows it to automatically rejoin the cluster.

After the failed server rejoins the cluster, failback is the process that automatically redistributes the workload again to include the newly rejoined server.

Balancing the load
Depending upon the load balancing implementations, administrators may be able to specify how much of the load each host should handle (the weight) or spread the load equally among all hosts in the load balancing cluster.
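Weighted distribution can be sketched in a few lines of Python. This is an illustrative model of the general technique, not any particular vendor's algorithm: each host is chosen with probability proportional to its configured weight.

```python
import random

def pick_weighted(weights, rand=random.random):
    """Choose a host with probability proportional to its weight.

    `weights` maps host name -> weight; a host with weight 3 receives
    roughly three times the traffic of a host with weight 1.
    """
    total = sum(weights.values())
    threshold = rand() * total
    cumulative = 0.0
    for host, weight in weights.items():
        cumulative += weight
        if threshold < cumulative:
            return host
    return host  # fallback for floating-point edge cases
```

Setting all weights equal reduces this to an even spread, which corresponds to the "distribute equally" option most implementations offer.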

Load balancing can be integrated with other network services such as network address translation (NAT). The load sharing network address translation (LSNAT) technology allows for a router to intercept client requests directed to a server and select a node in the server pool to which the request will be sent, based on the load sharing algorithm.

RFC 2391
Load sharing using NAT (LSNAT) is discussed in RFC 2391.

Hardware load balancing devices tend to be expensive; you may need to purchase two devices to avoid the problem of having a single point of failure, with the second device remaining passive unless a failure occurs. Software solutions may be based on a “dispatcher” model in which all incoming requests go through one server, the “dispatcher server,” and are then distributed to other servers in the cluster. Other software solutions are fully distributed, avoiding the bottleneck that can result from the dispatcher model.

Single system image
The single system image is the user interface that allows administrators to manage all of the cluster resources from a central location.

For example, when you install and configure NLB on a Windows 2000 computer, you can use the cluster control utility Wlbs.exe (located in the <systemroot>\System32 folder) to modify load balancing parameters (see Figure A).

Figure A
Windows 2000 includes the cluster control utility for managing load balancing clusters.

The cluster control utility can be run on the host servers that are members of the cluster or on any other Windows 2000 machine that has access to the cluster over the network (if remote control is enabled).

Enabling remote control
If you enable remote control for your W2K load balancing cluster, you should set a password to prevent unauthorized users from gaining access to the cluster. Do this for each host in its NLB properties. You should also use a firewall to protect the UDP control ports 1717 and 2504 on the cluster IP address, which receive the remote control commands.

Configuration changes to a load balancing server can also be made using the graphical interface on the Network Load Balancing (NLB) Properties sheet, as shown in Figure B.

Figure B
Use the NLB Properties sheet to configure Windows 2000 load balancing servers.

To access the NLB Properties sheet of a Windows 2000 Server, select Settings | Network And Dialup Connections | Local Area Connection | Properties, select Network Load Balancing in the list of installed network components, and then click the Properties button.

Windows Network Load Balancing
Microsoft offered the Windows Load Balancing Service (WLBS) for Windows NT 4.0 Enterprise Edition to work in conjunction with MSCS. Windows 2000 Advanced Server includes clustering and load balancing technology, now called Network Load Balancing (NLB).

Windows NT Load Balancing Service
WLBS is downloadable from the Microsoft Web site. It allows load balancing clusters of up to 32 Windows NT servers, using a distributed algorithm to map the workload between cluster nodes.

How NLB works in Windows 2000
NLB operates as a networking driver in Windows 2000. Microsoft recommends that load balancing servers be configured with two network interface cards (NICs). Then NLB can be configured to use one NIC for client-to-cluster traffic, and other network traffic can be handled by the other adapter. (This is a best practice for highest performance; however, NLB can be implemented with only one NIC.)

One or more virtual IP addresses are assigned to the NLB cluster. All the hosts in the cluster can detect network traffic that is addressed to the cluster’s primary IP address. (Each host also has a dedicated IP address that is unique to that host, which is used for network traffic that is not associated with the cluster.) The NLB driver on each host allows that host to receive a portion of the incoming cluster traffic.

NLB maps the incoming clients to cluster hosts based on IP address, port, and other information. The NLB filtering algorithm that runs on each host examines incoming TCP/IP packets to determine which cluster host will handle each packet.
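The key idea is that filtering is fully distributed: every host sees every cluster-addressed packet and runs the same deterministic hash, and only the host whose index matches accepts the packet. Here is a simplified Python sketch of that idea (the hash function and inputs are illustrative, not NLB's actual algorithm):

```python
import zlib

def host_accepts(src_ip, src_port, my_index, num_hosts):
    """Decide whether this host should handle an incoming packet.

    Every host in the cluster computes the same hash over the packet's
    source address and port; exactly one host's index matches, so exactly
    one host processes the packet without any central dispatcher.
    """
    key = zlib.crc32(f"{src_ip}:{src_port}".encode())
    return key % num_hosts == my_index
```

Because no packet passes through a dispatcher node, there is no single bottleneck, and because the hash is deterministic, all hosts agree on who owns each connection without exchanging per-packet messages.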

NLB can operate in either unicast or multicast mode. Unicast mode uses the cluster’s Media Access Control (MAC) address. The MAC address of the host computer’s network adapter is not used. In multicast mode, both the cluster MAC address and the NIC’s built-in MAC address are used. The cluster MAC address is used for client-to-cluster traffic, and the NIC’s MAC address is used for other network traffic destined for the individual host, not for the cluster.
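Microsoft's documentation describes the cluster MAC address as being derived directly from the cluster's primary IP address: unicast cluster MACs take the form 02-BF-W-X-Y-Z and multicast MACs the form 03-BF-W-X-Y-Z, where W.X.Y.Z are the four octets of the primary IP. A short Python sketch of that derivation (treat the exact prefixes as documented conventions, not something to hard-code against):

```python
def nlb_cluster_mac(primary_ip, multicast=False):
    """Derive the NLB cluster MAC address from the cluster's primary IP.

    Unicast mode uses the 02-BF prefix, multicast mode 03-BF; the
    remaining four bytes are the octets of the primary IP address.
    """
    prefix = "03-BF" if multicast else "02-BF"
    octets = [int(o) for o in primary_ip.split(".")]
    return prefix + "".join(f"-{o:02X}" for o in octets)
```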

Installing and configuring Network Load Balancing in Windows 2000 Advanced Server
NLB is installed as a networking component, using the Properties dialog box of the server’s local area connection (see Figure C).

Figure C
NLB is installed as a network component on the server’s local area connection.

By default, NLB will be listed in the components list, but its check box will not be checked. To use and configure it, check the check box, highlight the selection, and click the Properties button.

Installing load balancing
If Network Load Balancing has been uninstalled, you can install it by clicking the Install button and selecting it from the Service list. You may be prompted to insert the Windows 2000 Advanced Server CD or enter a network path to the installation files.

Configuration of NLB involves first setting the cluster parameters on the first tab of the NLB properties sheet:
  • Install a second NIC, if desired.
  • Provide the cluster’s primary IP address in dotted quad format. This is a virtual IP address that is the same for all hosts in the cluster.
  • Provide the subnet mask for the cluster IP address.
  • Provide a cluster name, which will be the same for all hosts in the cluster. This is a fully qualified domain name, and it must resolve to the cluster’s primary IP address via your DNS server or hosts file.
  • A cluster network address (the MAC address for the NIC that will handle client-to-cluster traffic) will be automatically generated by NLB, based on the primary IP address.
  • Decide whether to enable multicast support. All hosts must operate in the same mode (unicast or multicast). By default, multicast is not enabled.
  • Set and confirm a remote password, used to restrict remote access to the cluster via the Wlbs cluster control utility.
  • Specify whether remote control is allowed. By default, it is disabled.

The second tab of the NLB properties sheet, used to set host parameters, is shown in Figure D.

Figure D
The second step in configuring NLB is to set the host parameters.

To set the host parameters, perform the following:
  • Set a priority ID that defines the host’s unique priority for handling any traffic for TCP and UDP ports that aren’t covered by port rules. (Values are 1 to <number of hosts>, and 1 is the highest priority.) Each host in the cluster must have a different priority setting.
  • Set the initial cluster state for this host. If the Active box is checked, NLB will start and the host will join the cluster when Windows 2000 is started. The box is checked by default; if you do not want NLB to start when the operating system loads, uncheck it.
  • Provide the dedicated IP address that is unique to this host.
  • Provide the subnet mask for the dedicated IP address.

The last tab on the NLB properties sheet is used to set port rules, as shown in Figure E.

Figure E
The last step in configuring NLB is to specify port rules.

Port rules are used to control how the cluster traffic for each port is handled. You can choose from three filtering methods:
  • Multiple hosts: The traffic for this rule will be handled by multiple hosts in the cluster. You can assign a load weight or you can choose to distribute the traffic equally among the hosts.
  • Single host: Traffic for the ports designated by this rule will be handled by one host, determined by the handling priority.
  • Disabled: Network traffic for the port(s) associated with this rule will be blocked. This allows you to keep out unwanted traffic.

The affinity setting allows you to determine whether multiple requests from the same client should be sent to the same cluster host. The None setting indicates no client affinity (that is, multiple requests from a client need not be sent to the same server). Single is used for client affinity; NLB will send multiple requests from the same client to the same server. The Class C setting allows you to direct all client requests from an entire Class C address range to the same cluster host. Use this when you need client affinity and you have clients that use multiple proxy servers to access the cluster (which could make it appear as if requests from the same client were coming from different computers). The default setting is Single.
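The three affinity modes amount to hashing on different parts of the client's address. The following Python sketch models the behavior (it is an illustration of the concept, not NLB's actual mapping algorithm):

```python
import hashlib

def pick_host(client_ip, client_port, hosts, affinity="Single"):
    """Map a client to a cluster host under the three affinity modes.

    - "None":   hash on IP and port, so connections spread across hosts.
    - "Single": hash on the full client IP, so one client sticks to one host.
    - "ClassC": hash on the first three octets, so a whole Class C
                address range sticks to the same host.
    """
    if affinity == "None":
        key = f"{client_ip}:{client_port}"
    elif affinity == "Single":
        key = client_ip
    else:  # "ClassC"
        key = ".".join(client_ip.split(".")[:3])
    digest = hashlib.md5(key.encode()).digest()
    return hosts[int.from_bytes(digest[:4], "big") % len(hosts)]
```

With "ClassC", two clients at, say, 10.0.0.1 and 10.0.0.200 hash to the same key ("10.0.0"), so requests routed through different proxies in the same Class C range still reach the same cluster host.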

Disabling affinity by using the None option improves performance, but you’ll need to enable affinity if the server application uses cookies or other session state information between connections.

The settings you choose in the NLB Properties sheets will be recorded in the Windows 2000 Registry when you click OK.

After you configure NLB properties, you must set up TCP/IP for Network Load Balancing. To do so, ensure that the dedicated IP address set in NLB properties is set as the IP address in the TCP/IP properties sheet and that the cluster IP address is added under Advanced TCP/IP properties. If the cluster uses additional virtual IP addresses (for example, a multihomed Web server), these should also be entered in the Advanced settings.

Both IP addresses (dedicated and cluster’s primary address) must be static addresses; you cannot have them assigned via DHCP.

Server load balancing provides a way to improve performance for high traffic server functions and, when used in conjunction with server clustering, it provides fault tolerance for your mission-critical network services. In this Daily Drill Down, I have provided an overview of what load balancing and server clustering are, how they work, and how to configure a Windows 2000 Advanced Server to use NLB.

About Deb Shinder

Debra Littlejohn Shinder, MCSE, MVP is a technology consultant, trainer, and writer who has authored a number of books on computer operating systems, networking, and security. Deb is a tech editor, developmental editor, and contributor to over 20 add...
