In the article "Things to consider when planning server clusters", I explained that the first step of planning any server cluster is to weigh the cluster's cost against its benefit and against the cost of downtime. In this article, I will assume that you have decided to move forward with deploying a cluster, and will discuss the particulars of planning your cluster deployment.
Now that you've decided to move forward with building a cluster, your first consideration should probably be the cluster hardware. If you have been working with Windows Server for many years, then you probably recall a time when adhering to the Hardware Compatibility List was critical. At best, Microsoft would refuse to support hardware that was not listed on the Hardware Compatibility List. At worst, the hardware simply wouldn't work with Windows Server.
Today, just about any server hardware will run Windows without any problems. Even so, I would highly recommend purchasing only hardware that is listed on the Hardware Compatibility List when it comes to clusters. There are a couple of reasons for this. First, clusters can be a bit finicky. If you are implementing a cluster to increase the reliability and fault tolerance of a mission critical application, then it makes sense to invest in hardware that is proven to be reliable and fully compatible with your cluster solution. You can access a list of Microsoft approved cluster hardware at Microsoft's Web site.
Another reason why I recommend adhering to the Hardware Compatibility List is that doing so guarantees that Microsoft will provide you with technical support should you ever need it. It has been my experience that Microsoft's technical support service is a lot less strict about requiring server hardware to be listed on the Hardware Compatibility List than it used to be. Even so, Microsoft's Web site specifically states that they do not support Windows running on hardware that is not listed on the Hardware Compatibility List. If that isn't convincing enough, then imagine trying to explain to your boss that you cannot get support for the failure of a mission critical server cluster because you decided to try to save a few bucks by buying cheap hardware.
As you can see, there are definite advantages to selecting hardware that is listed on the Hardware Compatibility List. Of course, there is quite a variety of hardware on the list. Therefore, there are some additional guidelines that you should follow when picking out cluster hardware.
For starters, you must ensure that each node in the cluster uses the same processor architecture. For example, a cluster cannot contain both 32-bit and 64-bit servers. Another consideration is that although clustering technically requires only a single Network Interface Card in each cluster node, I recommend that you install a second Network Interface Card into each cluster node. The reason for doing so is that it allows you to isolate the communications between cluster nodes to a dedicated network segment. One of the Network Interface Cards in each cluster node will be used for normal network traffic, while the other will be used solely for communications between nodes. This prevents communications between cluster nodes from congesting your network with extra traffic, and prevents latency related to other types of network traffic.
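As a rough illustration, the dedicated node-to-node network is usually nothing more than the second NIC configured with a static address on a private subnet. The interface name and addresses below are hypothetical, and the address would need to be unique on each node:

```shell
rem Hypothetical sketch: assign the second NIC (renamed "Heartbeat")
rem a static address on a private subnet used only for node-to-node
rem cluster traffic. Run on each node with a unique host address.
netsh interface ip set address name="Heartbeat" static 10.10.10.1 255.255.255.0
```

Because this subnet carries only cluster communications, no default gateway is configured on the heartbeat interface.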
In addition to these network connections, each cluster node must also have a dedicated connection to the shared storage device. For this connection, Windows Server 2003 supports the use of both SCSI and fiber channel.
There are two different types of SCSI based clusters: the traditional, hardware based SCSI implementation, and iSCSI. iSCSI involves transmitting SCSI commands over TCP/IP, and is used for creating geographically dispersed clusters. The more traditional, hardware based SCSI implementation imposes a strict maximum bus length, which limits how far apart the cluster nodes and the shared storage can physically be.
As I'm sure you know, most PCs come with IDE or SATA drives. Most server grade hardware, on the other hand, comes with SCSI drives. Just like IDE and SATA, SCSI drives are connected to a controller that is responsible for sending requests to the drive. The reason why SCSI is preferable to IDE in server hardware has to do with the bus architecture.
When an IDE drive receives a request from its controller, it returns a response to the request. The controller will not send another request until it has received this response. Because the controller can have only one outstanding request at a time, IDE drives are relatively inefficient. In contrast, SCSI drives offer better performance because the SCSI controller does not have to wait for each request to complete before sending the next one. This allows the SCSI controller to transmit requests in rapid succession.
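The throughput difference can be sketched with a toy timing model. The time units below are illustrative only, not real drive timings:

```python
# Toy model: time to complete N disk requests under two bus disciplines.
ISSUE = 1    # time units for the controller to put a request on the bus
SERVICE = 4  # time units for the drive to service one request

def lockstep_time(n_requests):
    """IDE-style: the controller waits for each response before issuing
    the next request, so issue time and service time never overlap."""
    return n_requests * (ISSUE + SERVICE)

def queued_time(n_requests):
    """SCSI-style: requests are issued back to back and the drive works
    through its queue, so after the first issue the drive is never idle."""
    return ISSUE + n_requests * SERVICE

for n in (1, 10, 100):
    print(n, lockstep_time(n), queued_time(n))
```

For 100 requests the lockstep model takes 500 units versus 401 for the queued model, and the gap widens as the issue overhead grows relative to service time.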
Although SCSI storage devices are fast and efficient, they do have their limitations as far as clusters go. In a normal SCSI implementation, the server contains a SCSI adapter that allows it to connect to various SCSI storage devices. The storage devices might be internal or they might be external. In either case, the devices are attached to the machine's SCSI bus.
The actual number of storage devices that are supported on a SCSI bus varies depending on the SCSI version in use. For example, the original SCSI standard allowed for up to eight devices, while other versions of SCSI allow 16 devices. In any case, the SCSI adapter itself counts as one of the devices on the SCSI bus. Each device on the bus, including the SCSI adapter, is assigned a SCSI ID, which is simply a unique number identifying that device on the bus. The original SCSI implementation, for example, supported up to eight devices and used SCSI ID numbers zero through seven.
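A small sketch makes the point that the adapter consumes one of the available IDs. The assumption that the adapter sits at ID 7 reflects the common convention for host adapters, but your hardware may differ:

```python
# Toy helper: which SCSI IDs remain for storage devices, given that
# the SCSI adapter itself occupies one of the available IDs.
def usable_device_ids(total_ids=8):
    """Return the SCSI IDs left for storage devices on a bus that
    supports total_ids devices (8 for the original standard, 16 for
    wider variants), assuming the adapter takes the conventional ID 7."""
    adapter_id = 7  # host adapters are conventionally assigned ID 7
    return [i for i in range(total_ids) if i != adapter_id]

print(usable_device_ids())  # IDs 0 through 6 remain for drives
```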
Another requirement of the SCSI bus is that both ends of the bus must be terminated. Termination refers to using electronic resistors to absorb the signal that is traveling across the bus when it reaches the end of the bus. This prevents the signal that has been absorbed from interfering with the next signal flowing across the bus. Some SCSI implementations make use of a physical terminator that attaches to the last device on the bus. Other SCSI implementations are auto terminating.
Now that I've given you a quick crash course in the SCSI architecture, let's talk about how this architecture affects server clusters. As you already know, each node within a server cluster requires a dedicated connection to a shared storage device. If the SCSI architecture is in use, then this means that the shared storage will be SCSI based. It also means that each cluster node must use a SCSI adapter to connect to the shared storage device.
The interesting thing about this configuration, though, is that the shared storage device cannot be part of multiple SCSI buses. As such, the cluster nodes and the shared storage device are all part of a single SCSI bus. Remember that each SCSI adapter counts as a device on that bus.
Because of SCSI's architectural limitations, a SCSI based cluster is limited to no more than two cluster nodes. A typical SCSI based cluster looks something like the one that's shown in Figure A. Additionally, these nodes must be running the 32-bit version of Windows Server 2003 Enterprise Edition. Additional cluster nodes and SCSI hubs are not supported in SCSI based clusters.
|A SCSI based cluster can have no more than two nodes.|
Fiber channel clusters
The other type of connectivity that Windows supports between cluster nodes and storage devices is fiber channel. Fiber channel was originally designed for general network connectivity, but is primarily used for communications between servers and storage devices.
The primary difference between fiber channel and SCSI is that SCSI uses parallel signaling, while fiber channel uses full-duplex serial signaling. Because SCSI uses parallel signaling, it may sound as though SCSI would have better performance and should be the connectivity of choice. Fiber channel does have its advantages, though. Because fiber channel does not rely on parallel signaling, it is much less sensitive to distance than SCSI is. As you may recall, SCSI implementations are limited to a very short total bus length.
Another advantage to fiber channel is that it was originally designed as a networking technology, similar to Ethernet. As such, it supports the simultaneous connection of many more devices than SCSI does.
Typically, when fiber channel is used for clustering, each cluster node will contain a fiber channel host adapter. This adapter is then used to attach the node to one or more shared storage devices. Of course, the actual implementation method used depends on the fiber channel topology in use. There are two different fiber channel topologies supported by Windows Server 2003: Fiber Channel Arbitrated Loop and Fiber Channel Switched Fabric.
Fiber Channel Arbitrated Loop
A Fiber Channel Arbitrated Loop gets its name because it is based on a ring topology in which the cluster nodes and the shared storage devices are all a part of a ring, as shown in Figure B.
|A Fiber Channel Arbitrated Loop follows a ring topology.|
Earlier I mentioned that fiber channel is not as limited as SCSI when it comes to the number of devices that can be plugged into the bus. A Fiber Channel Arbitrated Loop supports a total of 126 devices. Before you get too excited, though, it is worth noting that the Windows operating system imposes its own constraints. Because of these constraints, only two of the 126 devices can be cluster nodes. This means that a Fiber Channel Arbitrated Loop cluster can contain a huge number of storage devices, but only two nodes.
In case you're wondering, the reason why Microsoft imposed the two node limitation has to do with the fact that the Fiber Channel Arbitrated Loop topology uses shared bandwidth. If you look at the diagram shown in Figure B, you'll notice that no switch is present. Instead, there is a continuous fiber channel loop that passes through each device, which means a signal may have to pass through one device to get to another. As a result, each device in the loop must compete for bandwidth. The only way that Microsoft was able to guarantee that each cluster node would have sufficient bandwidth available was to limit the cluster to having no more than two nodes.
Fiber Channel Switched Fabric Network
A Fiber Channel Switched Fabric Network is the only Microsoft cluster solution that supports more than two nodes. If you look at Figure C, you will see that the architecture of a Fiber Channel Switched Fabric Network closely resembles that of an Ethernet network. Like an Ethernet network, each device is connected to a central switch, rather than to another device. This means that any device on the network is capable of establishing a direct path to any other device on the network.
|A Fiber Channel Switched Fabric Network uses a centralized switch to provide direct connectivity between individual devices.|
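The bandwidth contrast between the two topologies can be sketched with a toy model. The figures are arbitrary illustrative units, not real fiber channel speeds:

```python
LINK_BANDWIDTH = 100  # arbitrary units, not a real fiber channel speed

def arbitrated_loop_share(n_devices):
    """All devices on the loop contend for one shared medium, so the
    effective per-device bandwidth shrinks as devices are added."""
    return LINK_BANDWIDTH / n_devices

def switched_fabric_share(n_devices):
    """Each device has a dedicated full-bandwidth path to the switch,
    so adding devices does not dilute per-device bandwidth."""
    return LINK_BANDWIDTH

print(arbitrated_loop_share(10))   # the shared loop divides bandwidth
print(switched_fabric_share(10))   # the fabric does not
```

This is why Microsoft caps an arbitrated loop cluster at two nodes, while a switched fabric cluster can grow without starving individual nodes.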
As you have probably already guessed, a Fiber Channel Switched Fabric network is by far the most flexible, but most expensive type of cluster that Windows Server 2003 supports. When you build a Fiber Channel Switched Fabric Network, you are effectively building a Storage Area Network. Storage Area Networks are extremely flexible by their very nature. You can easily add additional cluster nodes or additional storage devices to the network anytime you want. This type of network design is also suitable for creating geographically dispersed clusters in which cluster nodes reside in various cities.
In this article, I have discussed the various clustering architectures that are supported by Windows Server 2003. As I did, I explained what each architecture is suited for, as well as its limitations. In the next part of this article series, I will continue the discussion by talking about application deployment within the cluster.