MSCS vs. NLB: Evaluating the pros and cons

When it comes to high availability clustering solutions in Windows 2000, you have two main choices: the Microsoft Cluster Service and network load balancing. Learn the advantages and disadvantages of each.

Microsoft Windows' two main clustering technologies, the Microsoft Cluster Service (MSCS) and network load balancing (NLB), provide high availability solutions, but they're aimed at mitigating different problems. To help you understand when you should choose one technology over the other, I'm going to examine the pros and cons of each.

A note on redundancy
MSCS and NLB both require some redundancy, because the surviving nodes (servers) in the cluster must be able to carry the load of a node that fails. However, you may need less redundancy with NLB than with MSCS. MSCS on Windows 2000 Advanced Server supports only two servers, so each server should be configured to run at a maximum of 50 percent capacity during normal use; if one fails, the other has to carry everything. NLB can support as many as 32 servers in the cluster, so a failed server's load is spread across many survivors and the redundancy required per server shrinks proportionally.
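
The arithmetic behind those figures can be sketched in a few lines of Python. This is only a rough model: it assumes identical servers, load spread evenly across them, and a cluster sized to survive a fixed number of simultaneous failures. The function name and its parameters are made up for illustration.

```python
def max_safe_utilization(nodes: int, failures_tolerated: int = 1) -> float:
    """Highest per-node utilization that still lets the surviving
    nodes absorb the load of the nodes that failed."""
    if nodes <= failures_tolerated:
        raise ValueError("need more nodes than tolerated failures")
    # Total load is nodes * u; the survivors can carry at most
    # (nodes - failures_tolerated) * 1.0, so u <= survivors / nodes.
    return (nodes - failures_tolerated) / nodes


for n in (2, 4, 32):
    print(f"{n:>2} nodes: run each at no more than {max_safe_utilization(n):.0%}")
```

Under these assumptions, a two-node cluster tops out at half capacity per server, while a 32-node cluster can run each server close to full capacity and still survive the loss of one node.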

Microsoft Cluster Service
Pros
  • MSCS offers fault tolerance of resources, services, and servers.
  • Most administration can be done within a standard Microsoft Management Console (MMC), with an equivalent command-line utility offering scripting capabilities.
  • Many Win2K services and server applications (e.g., WINS, DHCP, file shares, printing, IIS, Exchange, and SQL Server) are now cluster-aware to take advantage of this technology, and many generic applications and services can be configured in the cluster.
  • MSCS enables you to monitor at the resource level and customize dependencies—for example, not just monitoring the IIS service but also the network name, address, and disk that this service uses.
  • You can configure failed clustered groups to restart before attempting to fail over.
  • MSCS works independently of the protocol that is being used to access the servers.
  • MSCS can have a mixed-version cluster (one NT4 server and one Win2K server).
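
To make the point about resource dependencies more concrete, here is a minimal Python sketch of how a dependency chain determines the order in which resources can be brought online. It is not how MSCS is implemented; the resource names and the dependency map are hypothetical and simply mirror the IIS example in the list above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resource group: each key depends on the resources in its set.
dependencies = {
    "IIS service": {"Network name", "Physical disk"},
    "Network name": {"IP address"},
    "IP address": set(),
    "Physical disk": set(),
}


def online_order(deps):
    """Return an order in which every resource appears after
    all of the resources it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = online_order(dependencies)
print(" -> ".join(order))
```

Taking a group offline would walk the same order in reverse, so the IIS service stops before the name, address, and disk it relies on.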

Cons
  • MSCS requires special hardware: the whole unit must be on the clustering section of the Windows Hardware Compatibility List. Fault-tolerant clustered servers require shared storage (e.g., SCSI disks), which should be protected by hardware RAID.
  • Servers must be in a domain (either NT4 or Win2K) and cannot be in a workgroup. MSCS is therefore better suited to back-end servers than to servers in a demilitarized zone (DMZ). Note that permissions should be assigned to global and not local groups.
  • MSCS is heavily dependent on good hardware. Effective clustering runs on the assumption that the data is good. It cannot protect against or recover from corrupted databases or failed disks.
  • When a clustered server fails, there will still be some disruption. Connected users will be disconnected and typically asked to reauthenticate on the failed-over server.
  • The shared data store does not support IDE disks, software RAID, dynamic volumes, EFS, mounted volumes and reparse points, or remote storage.
  • Currently, Win2K Advanced Server supports only two servers per cluster (Datacenter Server supports four); the Advanced Server limit is due to increase to four with Windows .NET Server. This restriction severely reduces the capacity to scale out as needed and reduces fault tolerance.
  • Because of length restrictions on the interconnecting cable between servers, MSCS cannot offer clustering across geographically separate locations.

Network load balancing
Pros
  • NLB offers fault tolerance at the network layer, ensuring that connections are not directed to a server that is down.
  • NLB is good for scaling out. It supports up to 32 servers per segment and can also employ DNS round robin for multiple clusters on different segments.
  • NLB works as a driver rather than as a service (as it did in NT4), which makes it more efficient.
  • NLB is highly configurable, with rules allowing for client affinity, weighting, filtering, etc.
  • NLB is ideal in a DMZ for distributing traffic to back-end servers and for encrypting/decrypting data, which offloads that task from busy back-end servers.
  • No special hardware is required. You simply need two network adapters to mitigate a single point of failure.
  • You can have a mixed-version cluster (NT4 and Win2K).
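
The client affinity and weighting rules mentioned above can be illustrated with a small Python sketch. It is not NLB's actual algorithm. It only shows the general idea that a host can be chosen deterministically from the client's address, so the same client keeps landing on the same server. The function name, the host names, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib


def pick_host(client_ip: str, client_port: int,
              weights: dict[str, int], single_affinity: bool = True) -> str:
    """Map a client to a host in proportion to the hosts' weights.

    With single_affinity, only the client IP is hashed, so every
    connection from that client goes to the same host. Without it,
    the source port is included and connections spread out.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one host needs a positive weight")
    key = client_ip if single_affinity else f"{client_ip}:{client_port}"
    digest = hashlib.sha256(key.encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % total
    # Walk the hosts in a fixed order; each owns `weight` consecutive slots.
    for host, weight in sorted(weights.items()):
        if slot < weight:
            return host
        slot -= weight
    raise AssertionError("unreachable: slot is always below the total weight")


hosts = {"web1": 50, "web2": 30, "web3": 20}  # hypothetical load weights
print(pick_host("192.0.2.10", 40001, hosts))
print(pick_host("192.0.2.10", 40002, hosts))  # same host: affinity is per client IP
```

Because the result depends only on the packet's source and the shared rule set, each server could work out the same answer on its own, without a central dispatcher.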

Cons
  • NLB cannot detect whether a service on a server has crashed, so it could direct a user to a system that can't offer the requested service.
  • There is no shared data. NLB can direct clients only to a back-end server or to independently replicated data.
  • You can use NLB only with TCP/IP traffic, and that does not include IPSec traffic. The TCP/IP requirement itself is not a problem on most networks today.
  • You can't use NLB with Token Ring adapters or Layer 3 switches. You need to put a hub between the servers and the switch.
  • After initial configuration, maintenance and monitoring are done with a command-line utility, which is good for scripting but not as easy as a GUI application, such as an MMC, for ad hoc administration.
  • Windows 2000 Advanced Server does not provide a single, centralized way to configure NLB across servers. Therefore, all servers must be configured individually.
  • All servers in a cluster must be in the same subnet. To provide redundant network links with NLB, you'll need two (or more) clusters, and you'll need to employ DNS round robin.
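
Because NLB cannot tell whether a service has crashed, as noted in the first drawback above, administrators often pair it with some kind of external check. The Python sketch below shows one simple form such a check might take: a plain TCP connection attempt to the service port. It is an assumption about how you might monitor a node, not a feature of NLB, and what happens after a failed probe (alerting, taking the node out of the cluster) is left to your own tooling.

```python
import socket


def service_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds
    within the timeout, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A probe like this only confirms that something is listening on the port. A service that accepts connections but returns errors would still pass, so a real check may need to issue an application-level request as well.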

MSCS and NLB serve different purposes for organizations that need to ensure high availability for their network services. It's important to understand the strengths and weaknesses of each technology so that you know which one to deploy for various solutions. Also, as I pointed out in my previous article, there are times when the two technologies can work together as part of a single solution.
