Windows clustering: Microsoft Cluster Service or network load balancing?

Microsoft Cluster Service (MSCS) and network load balancing (NLB) can both help you address your high availability needs. They can even work together in some solutions. Get the lowdown on each technology, and see how they can cooperate.

The first of my recent articles on Windows clustering prompted a member to ask about the differences between the Microsoft Cluster Service (MSCS) and Microsoft’s other clustering technology, network load balancing (NLB). This is an important topic, and to do it justice, I decided to dedicate an entire article to it. I'll compare MSCS and NLB, both in terms of how they work and what solutions they offer. Along the way, I will point out some typical scenarios where the technologies can be used together to provide a high availability solution.

Carol Bailey's Windows clustering articles

Cluster Service explained
I've previously explained how and why you might use the Cluster Service in Windows 2000 Advanced Server for achieving high availability with services like virtual servers, file shares and dynamic shares, Dfs roots, printers, WINS, DHCP, and IIS—in addition to the more traditional clustering services, such as Exchange and SQL.

With this service, two servers (or up to four if you're running Windows 2000 Datacenter Server) typically share a data store so that another server can continue to offer the same data and service if the original fails (or if any dependent resource fails and can't be restarted). The benefits include not just fault tolerance against server or service failure, but also the ability to accommodate planned downtime for upgrades and security patches.

It’s important to reiterate that the Cluster Service does not protect the data. A shared data store simply means that more than one server can access it. The Cluster Service always assumes the data it offers is good, so you must ensure its integrity independently with hardware RAID (software RAID is not supported), UPS, good quality hardware, and so on.

The monitoring of availability is very granular. It’s not just the clustered service itself that is monitored, but all its dependencies. So, for example, with a clustered printer, the Cluster Service would monitor not just the spooler service but also the disk used for spooling and the virtual server’s IP address and name. Any of these resources can be configured to automatically attempt a restart if it fails. You can set it up so that after a specified number of failures, the service fails over to another server. Should the original server be able to offer the service again (e.g., the operating system disk has been repaired), the clustered service can be failed back to the original server.
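To make the restart-then-fail-over behavior concrete, here is a minimal Python sketch of the policy described above. It is an illustration only, not how the Cluster Service is implemented: the ResourceGroup class, its restart_threshold parameter, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    """Hypothetical model of a clustered service and the node hosting it."""
    name: str
    nodes: list                  # candidate nodes, preferred owner first
    restart_threshold: int = 3   # local restarts allowed before failing over
    owner_index: int = 0
    failures: int = 0
    log: list = field(default_factory=list)

    @property
    def owner(self):
        return self.nodes[self.owner_index]

    def report_failure(self, resource):
        """Record a failed dependency; restart it locally or fail over."""
        self.failures += 1
        if self.failures <= self.restart_threshold:
            self.log.append(f"restart {resource} on {self.owner}")
        else:
            # Too many failures on this node: move the group to the next one.
            self.owner_index = (self.owner_index + 1) % len(self.nodes)
            self.failures = 0
            self.log.append(f"fail over {self.name} to {self.owner}")

    def fail_back(self):
        """Return the group to its preferred owner once that node is healthy."""
        self.owner_index = 0
        self.failures = 0
        self.log.append(f"fail back {self.name} to {self.owner}")


group = ResourceGroup("PrintGroup", ["NODE1", "NODE2"], restart_threshold=2)
for _ in range(3):
    group.report_failure("spooler")   # third failure exceeds the threshold
print(group.owner)
```

In this sketch a failure of any dependency (spooler, disk, IP address, or name) counts against the same group, which mirrors the idea that the whole clustered service moves together.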

You administer the Cluster Service with the Cluster Administrator MMC or with the command-line utility Cluster.exe. In addition to ensuring stable, reliable storage hardware for the shared data store, the servers in a Cluster Service cluster must be in an NT4 or Windows 2000 domain.

Network load balancing explained
NLB is also a high availability solution, but it works at the network level (as its name implies) rather than at the resource level (as we see in the Cluster Service). It also functions as a driver rather than as a service, as it did in NT 4.0. Up to 32 servers can be clustered in the same subnet, appearing to clients as one server and distributing the load between them.

The servers in a network-load-balanced cluster are aware of one another, and when one server fails to respond, the rest converge and automatically redistribute new connections among themselves. This offers true horizontal scaling because you can simply add another server to the cluster when network load increases beyond the existing servers’ capabilities. When you reach the maximum of 32 servers, you can start another cluster and use DNS round robin to alternately hand out the clustered (virtual) addresses. Check out this article for a clarification on the differences between NLB and DNS round robin.
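The following Python sketch illustrates the general idea: every host applies the same rule to decide who handles a client, the survivors agree on a new membership when a host drops out, and DNS round robin alternates between cluster addresses. The hashing rule here is my own stand-in, not NLB's actual filtering algorithm, and the function names and IP addresses are hypothetical.

```python
import hashlib
from itertools import cycle


def pick_host(client_ip, hosts):
    """Map a client to one host; each host can compute the same answer alone."""
    ordered = sorted(hosts)
    digest = hashlib.sha256(client_ip.encode()).digest()
    return ordered[int.from_bytes(digest[:4], "big") % len(ordered)]


def converge(hosts, failed):
    """Return the membership the surviving hosts agree on after a failure."""
    return [h for h in hosts if h != failed]


def dns_round_robin(virtual_addresses):
    """Hand out cluster (virtual) addresses in turn, as DNS round robin does."""
    return cycle(virtual_addresses)


hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
owner = pick_host("203.0.113.7", hosts)
survivors = converge(hosts, owner)               # the owner stops responding
new_owner = pick_host("203.0.113.7", survivors)  # new connections go elsewhere
```

Because every host runs the same calculation over the same membership list, no central dispatcher is needed, which is the property that lets the cluster appear to clients as a single server.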

Note that there is no sharing of data between servers running NLB, so with Windows 2000 Advanced Server, you will have to independently configure each server and replicate data that’s needed for the load balanced services. Windows 2000 Datacenter Server offers Application Center 2000 to help manage multiple NLB servers, but this setup is too expensive and complex for many corporate networks.

NLB has a reputation for being used with Web servers, but you can also effectively use it on your corporate network with services such as file servers and printers, terminal servers, and even PPTP servers. However, IPSec cannot be load balanced, and this applies to straight IPSec in both transport and tunnel mode and L2TP/IPSec.

You enable and configure NLB in the General tab of the Network Load Balancing properties for your network adapter. Here, you enter the parameters you want to use on the three tabs: Cluster Parameters, Host Parameters, and Port Rules. Monitoring and maintenance are done with a command-line utility called WLBS.exe, which you can run locally or remotely (if enabled). If you choose to run it remotely, you can also password-protect it for obvious security reasons.

Unlike Cluster Service, NLB requires no special hardware. The only stipulation is that two network cards are used to eliminate a single point of failure. And unlike Cluster Service, you don’t have to run these servers in a domain. They can be in a workgroup outside your domain, thereby presenting a smaller security risk.

MSCS and NLB working together
When deciding between these clustering technologies, it helps to have a good understanding of exactly what they protect and how. Both have their advantages and disadvantages, some of which we’ve covered here and some of which I’ll cover more extensively in my next article.

You cannot run both clustering technologies on the same server, but you can run them together in the same solution. With such an arrangement, Microsoft would portray them as complementary rather than conflicting, provided, of course, that hardware and licenses are plentiful.

Let's consider two examples that typify how and when you might use both advantageously: database-driven Web servers and e-mail access. We’ll also look at a lesser-known scenario, using the same principles but employing some more obscure clustering techniques.

In all cases, the NLB servers are placed in the DMZ, simply accepting and rerouting traffic from the Internet to back-end servers that are running the Cluster Service. Because no data is held on the NLB servers, the security implications are lessened. And because the servers are running very few services, they are easier to harden. The data is kept secure behind another firewall. Connecting users are unaware of where that data is held and will have minimal inconvenience if the back-end servers change. Let's see how this works.

Web servers are placed within the DMZ and configured to accept traffic only on ports 80 (HTTP) and 443 (HTTPS). All other ports can be blocked. If SSL is being used, these servers all need a certificate installed and will be responsible for the encryption/decryption over the Internet. All servers will have Internet Information Services (IIS) running, configured identically, but the actual data being used (such as an online shopping store) is stored in a SQL database running on a back-end server (behind another firewall), which is running the Cluster Service.

The SQL back-end server is more secure because it's not directly available on the Internet, it won't have the overhead of SSL, and it can be independently reconfigured without requiring external changes (to the DNS or on clients). The data is monitored by the Cluster Service to ensure that it remains available on at least one of the clustered servers. You can also use IPSec so that the SQL servers accept connections only from the NLB servers. This will provide an additional safeguard on top of your firewall filtering rules and any router configuration.
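As a rough illustration of the two layers of filtering in this design, the Python sketch below models only the allow/deny decisions: the DMZ accepts Web traffic alone, and the back end accepts SQL connections only from the NLB servers. It does not model IPSec or any real firewall product. The function names and the NLB host addresses are hypothetical; 1433 is the default SQL Server TCP port and is assumed here.

```python
from ipaddress import ip_address

WEB_PORTS = {80, 443}   # HTTP and HTTPS, the only ports open to the Internet
SQL_PORT = 1433         # default SQL Server port (assumed for this sketch)

# Hypothetical addresses of the NLB Web servers in the DMZ.
NLB_HOSTS = {ip_address("192.0.2.11"), ip_address("192.0.2.12")}


def dmz_allows(dest_port):
    """Outer firewall: Internet clients may reach only the Web ports."""
    return dest_port in WEB_PORTS


def backend_allows(source_ip, dest_port):
    """Inner layer: only NLB servers may open SQL connections."""
    return dest_port == SQL_PORT and ip_address(source_ip) in NLB_HOSTS


print(dmz_allows(443), backend_allows("198.51.100.5", SQL_PORT))
```

Keeping the second check independent of the first is the point of the design: a mistake in the outer firewall rules still leaves the SQL servers refusing anything that doesn't come from an NLB host.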

For Exchange 2000 offering IMAP4, POP3, and OWA for Internet-connected clients, the process is similar. The servers in the DMZ that accept the Internet connections are running NLB, with the actual data (e-mail, public folders, and calendar) held on back-end servers running the Cluster Service. Only this time, each NLB server that has Exchange 2000 installed must have an additional option checked to enable it as a front-end (FE) server—a new feature in Exchange 2000. Then, make sure you remove any data stores on these FE servers.

In this scenario, the NLB servers will need to communicate with more servers on the internal network than just the Exchange servers that hold the data stores. Exchange 2000 FE servers locate the back-end Exchange data stores by querying the Active Directory Global Catalog. And because this service is for your internal domain users rather than for anonymous public access, users must be authenticated with domain controllers. The Global Catalog and domain controllers are vital services, which, in this scenario, gain an extra layer of security by being only indirectly accessible from the Internet. As before, encryption for added security will require a server certificate on the NLB servers, which offloads the encryption/decryption process from the back-end servers.
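The front-end flow just described can be sketched in a few lines of Python: authenticate the user, look up which back-end server holds the mailbox, and forward the request there. This is a simplified model under my own assumptions, not Exchange's actual lookup; the GLOBAL_CATALOG dictionary, the route_request function, and the server names are all hypothetical.

```python
# Hypothetical stand-in for the Global Catalog: user -> back-end server.
GLOBAL_CATALOG = {"alice": "EXBE1", "bob": "EXBE2"}


def route_request(user, authenticated):
    """Return the back-end server a front-end server should forward to."""
    if not authenticated:
        # Domain users must be authenticated before any lookup happens.
        raise PermissionError(f"{user} has not been authenticated")
    if user not in GLOBAL_CATALOG:
        raise LookupError(f"no mailbox found for {user}")
    return GLOBAL_CATALOG[user]


print(route_request("alice", authenticated=True))
```

Because the front-end servers hold no mailbox data themselves, any of them can answer any user, which is what makes them suitable for load balancing.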

Our final example involves multiple VPN servers offering PPTP connections, all with NLB configured. Remember that IPSec won’t work with NLB, so you can’t use L2TP/IPSec. This means that you'll have to use PPTP. Once connected and authenticated, users connect to back-end clustered file servers. Users just need to know one IP address (or server name) for their remote connection, irrespective of how many VPN servers are actually available. Access to their data is always protected on clustered servers.

Putting it all together
We’ve taken an under-the-hood look at how both Microsoft clustering technologies work, including their differences and similarities, and how they can work together to offer highly available solutions. Delving a little deeper into specifics, my next article will draw up a list of pros and cons for each technology to help you decide which to use when you have to choose between them.
