Clustering servers together increases the scalability and availability of networked resources. The drawback, though, is that clustered servers traditionally had to sit very close to one another, often side by side in the same room.
But a new clustering technology is slowly starting to emerge: long-distance clustering. Long-distance clustering allows you to place cluster nodes farther apart than previously possible. However, long-distance clustering presents several special challenges of its own. Let’s take a look at how long-distance clustering works and what you need to know to get it properly set up.
Clustering basics
A cluster is a group of servers that function as one. There are two main types of clusters: a network load balancing cluster and a hardware cluster. A network load balancing cluster consists of multiple servers, each running a common application.
In a hardware cluster, two or more servers are connected to the network, to each other, and to a shared disk array. There are several variations of this particular clustering model, but the basic idea is that one server is active while the other stands by. The two servers exchange a heartbeat across a dedicated network connection. If the heartbeat stops, the standby server picks up the workload.
For more information about clustering, see the Daily Drill Down “Cluster file shares on your Windows 2000 servers to increase availability.”
Reasons to use long-distance clustering
Traditionally, network clusters involve computers in close proximity to one another. Long-distance clustering is useful from a business continuity standpoint. Imagine for a moment that you have two buildings and that a long-distance cluster spans them. If one of the buildings were destroyed by fire, tornado, or terrorist attack, your business could continue to operate because you would still have a functioning cluster node in the other facility. Keep in mind, though, that there is only a single storage device shared by the cluster nodes. If the facility that contains the storage device is destroyed, you’re out of business; if the other facility is destroyed, your cluster will continue to function.
At best, that disaster recovery model seems a little haphazard; but let’s take a look at it in less catastrophic terms. Suppose the entire building isn’t destroyed but that, instead, a server fails. Regardless of which server fails, the other server would continue to run, and all applications and data within the cluster would remain available to all users.
The challenges of long-distance clustering
There are three main components to any clustered node. First, there’s the private network connection. This is the connection to your private network, through which users access data and applications. Second, there’s a dedicated network connection, which exists only within the cluster and is used exclusively for monitoring node health. Finally, there’s the link to the shared storage device. To make long-distance clustering work, all three of these components must be spanned.
The dedicated connection
Let’s begin by talking about the dedicated connection between the clustered nodes. The dedicated connection is used for transmitting and receiving heartbeats. For all practical purposes, a heartbeat is simply a ping. Heartbeats are sent at 0.7-second intervals, with up to 0.2 seconds allowed for variation. If a heartbeat is not received from another cluster node for 0.9 seconds (the 0.7-second interval plus the 0.2 seconds of allowed variation), the functioning cluster node begins pinging the malfunctioning cluster node.
If the malfunctioning cluster node doesn’t respond within 5.3 seconds, the functioning node tries to reestablish contact with it via the private network (instead of the dedicated network). If the malfunctioning node doesn’t respond to pings sent over the private network, the node is assumed to be dead, and failover procedures begin.
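To make the timing concrete, here’s a rough Python sketch of the failure-detection logic just described. The constants come straight from the figures above, but the function and callback names are purely illustrative; they don’t correspond to any real cluster service API.

    import time

    # Illustrative timing values taken from the figures above.
    HEARTBEAT_INTERVAL = 0.7   # seconds between scheduled heartbeats
    ALLOWED_VARIATION = 0.2    # seconds of allowed variation
    MISS_THRESHOLD = HEARTBEAT_INTERVAL + ALLOWED_VARIATION   # 0.9 seconds
    PING_TIMEOUT = 5.3         # seconds to wait for a reply before trying the private network

    def node_is_dead(last_heartbeat, ping_dedicated, ping_private):
        """Return True if the peer node should be declared dead and failover started.

        last_heartbeat -- timestamp of the last heartbeat received from the peer
        ping_dedicated -- callable that pings the peer over the dedicated network,
                          waiting up to `timeout` seconds; returns True on a reply
        ping_private   -- the same, but over the private (client-facing) network
        """
        if time.time() - last_heartbeat < MISS_THRESHOLD:
            return False    # heartbeats are still arriving on time
        if ping_dedicated(timeout=PING_TIMEOUT):
            return False    # the node answered on the dedicated link
        if ping_private(timeout=PING_TIMEOUT):
            return False    # the node answered on the private network
        return True         # no response anywhere: begin failover procedures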
The network connection
Before I get into discussing the specific hardware configurations, there are a few prerequisites you need to know about. Regardless of the distance you’re covering, your transmission medium between the clustered nodes must be fiber-optic-based and must use TCP/IP. The actual cabling requirements and necessary GBIC modules and connectors differ depending on the distances that you plan to cover.
For spans of up to 500 meters, you can use 50 um, 800 nm, shortwave, multimode fiber optic cable. This cable utilizes an SX or FX connector and supports a shortwave GBIC module.
For spans of up to 10 km, you’ll need 9/10 um, 1300/1500 nm, longwave, single-mode fiber optic cable. You’ll also need an FX connector for distances up to 2 km and an LX connector for distances greater than 2 km but less than 10 km. The GBIC module must support longwave.
Not all of the technology necessary for long-distance clustering spans of more than 10 km is publicly available yet. However, according to HP, spans of over 10 km will require 9 um, 1550 nm, longwave, single-mode fiber optic cable. The cable will use an LX connector (which hasn’t been released yet) and a longwave or a very longwave GBIC module, depending on the distance that you’re covering.
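As a quick planning aid, those distance ranges can be boiled down to a simple lookup, sketched here in Python. The summary strings are only a restatement of the ranges above; verify the exact cable, connector, and GBIC part numbers against your vendor’s documentation.

    def cable_spec(distance_km):
        """Return the cabling described above for a given node-to-node span."""
        if distance_km <= 0.5:
            return "50 um, 800 nm shortwave multimode fiber; SX/FX connector; shortwave GBIC"
        if distance_km <= 2:
            return "9/10 um, 1300/1500 nm longwave single-mode fiber; FX connector; longwave GBIC"
        if distance_km <= 10:
            return "9/10 um, 1300/1500 nm longwave single-mode fiber; LX connector; longwave GBIC"
        return ("9 um, 1550 nm longwave single-mode fiber; LX connector; "
                "longwave or very longwave GBIC (not yet generally available)")

    print(cable_spec(7))   # prints the 2 km to 10 km single-mode/LX combination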
There are several topologies you can use for sending the necessary data between the clustered nodes. Two of the most common choices are ATM and Gigabit Ethernet. I recommend using Gigabit Ethernet for several reasons.
First, Gigabit Ethernet is comparatively inexpensive and easy to implement. In a short-range implementation, Gigabit Ethernet will work at a range of 550 meters over multimode fiber optic cable. In such an implementation, you can also get away with using shortwave lasers over fiber with a 50-micron core. Gigabit Ethernet will also work at up to 5 km using 10 um, 1300 nm, single-mode fiber optic cable. There are also Gigabit Ethernet solutions available for even longer ranges.
In addition to being cheap and easy to implement, Gigabit Ethernet is completely compatible with existing fast Ethernet technology. This means that a packet can flow from a fast Ethernet network, over a Gigabit Ethernet network, and back to a fast Ethernet network without having to be translated, as would be the case with ATM. The fact that the packet doesn’t have to be translated to use a different architecture means that there are no extra bottlenecks caused by the switching process.
Two other benefits of Gigabit Ethernet are that it natively supports TCP/IP and that it supports full duplex communications. These various factors make Gigabit Ethernet ideal for long-distance clustering. Although Gigabit Ethernet is usually the ideal topology for long-distance clustering, it is by no means the only choice. FDDI, Giganet, and ATM are all valid choices.
If you look at the official specifications for Gigabit Ethernet, you’ll see that it is only designed to perform at distances of up to 5 km. However, HP has performed extensive testing and has actually gotten Gigabit Ethernet to perform well at distances of up to 10 km. Additionally, there is speculation that, under the right conditions, Gigabit Ethernet could work at distances as great as 50 km.
In tests in which the Gigabit Ethernet distance specifications were exceeded, HP relied on 1550 nm, high-quality, fiber optic cable. Additionally, these test results were achieved using HP ProCurve Ethernet switches equipped with 1000 LX longwave and 1000 SX shortwave Gigabit Ethernet modules.
If you’re looking at longer distances and higher-capacity clustering spans, you might be able to take advantage of a technology known as dense wavelength division multiplexing (DWDM). DWDM is a new technology that uses a combination of several lasers to simultaneously send signals over a single fiber optic cable. This technique is possible because each laser uses a different wavelength, or color, of light, which makes it possible to split the individual signals apart again at the receiving end.
You might have heard of wavelength division multiplexing (WDM), as this technology has been around for a while. WDM is a technique that uses two normal lasers to place two different signals onto a fiber optic cable, one in the 1300 nm range and the other in the 1500 nm range. The reason I mention this is that DWDM is a direct extension of WDM technology.
The difference between the two technologies is that some of the WDM implementations use up to four wavelengths, spaced 400 GHz apart. DWDM, on the other hand, allows for up to 160 wavelengths, spaced a mere 50 GHz apart. Furthermore, each of these 160 wavelengths, or channels as they are sometimes called, can carry up to 10 gigabits of data per second.
DWDM technology works in part because of the Erbium Doped Fiber Amplifier (EDFA). This amplifier functions only in the 1500 nm pass band. The real magic happens as inline EDFAs amplify the signal as it travels to the next amplifier or termination point. In case you’re wondering about distance, DWDM can function at distances of up to 800 km with up to 120 km between amplifiers.
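To put those numbers in perspective, here’s a quick back-of-the-envelope calculation based on the figures above. Real-world channel counts, data rates, and amplifier spacing will vary by vendor, so treat the results as rough orders of magnitude.

    import math

    channels = 160            # DWDM wavelengths per fiber
    rate_per_channel = 10     # gigabits per second per wavelength
    total_span_km = 800       # maximum DWDM reach
    amp_spacing_km = 120      # maximum distance between EDFAs

    aggregate_gbps = channels * rate_per_channel                  # 1,600 Gbps per fiber
    inline_amps = math.ceil(total_span_km / amp_spacing_km) - 1   # about 6 inline EDFAs

    print(f"Aggregate capacity: {aggregate_gbps} Gbps")
    print(f"Inline amplifiers for an {total_span_km} km span: about {inline_amps}")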
At the present time, DWDM is used mostly for high-traffic backbones flowing over long distances. In fact, this technology has been used to connect entire cities to each other. In spite of its current usage, you can expect to see more DWDM technology entering the private sector in the not too distant future. DWDM technology is much less expensive than SONET technology because of its scalability. You can expect to see DWDM going mainstream because a single DWDM system can easily support 100 gigabits per second of traffic over a fiber pair, while the absolute maximum for SONET is 10 gigabits per second, and that’s only over exceptional-quality fiber. The practical limit for SONET over standard-grade fiber is a mere 2.5 gigabits per second.
There is a potential future for SONET technology, however. SONET networks can increase their capacity through time division multiplexing. This means that the line’s timing is divided into smaller and smaller increments to allow more and more data to flow through the line.
The downside to time division multiplexing, though, is that when service providers implement it, they must make the leap to the higher bit rate all at once. This means that they must purchase more capacity than is initially needed. Based on the current SONET implementation, predictions are that the next incremental jump from the currently available 10 gigabits per second will be to 40 gigabits per second. Just don’t expect this to happen any time soon, however, since SONET presently has trouble even achieving 10-gigabit speeds.
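The 2.5, 10, and 40 gigabit figures correspond to the standard SONET OC levels, which scale as exact multiples of the OC-1 base rate of 51.84 Mbps. A quick calculation shows where those jumps come from:

    OC1_MBPS = 51.84   # base SONET rate

    for n in (48, 192, 768):
        print(f"OC-{n}: {OC1_MBPS * n / 1000:.2f} Gbps")

    # OC-48:  2.49 Gbps  (the practical limit over standard-grade fiber)
    # OC-192: 9.95 Gbps  (the 10-gigabit level)
    # OC-768: 39.81 Gbps (the predicted next incremental jump)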
The storage connection
Obviously, current technology limitations prevent you from stretching a SCSI cable over extended distances. Long-distance clustering solves this problem by using the same medium for both storage data and heartbeat data: both are sent over the longer distances as IP packets. All you need to do is make sure your storage devices are on the same network as your servers, and you’re ready to go.
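As a simple sanity check, you can confirm that a storage device’s address sits on the same IP network as a server’s interface. The following Python snippet uses hypothetical addresses purely for illustration; substitute your own.

    import ipaddress

    server_interface = ipaddress.ip_interface("10.0.5.21/24")    # hypothetical server address
    storage_device = ipaddress.ip_address("10.0.5.200")          # hypothetical storage array address

    if storage_device in server_interface.network:
        print("Storage device is on the server's local IP network.")
    else:
        print("Storage device is on a different network; check routing before clustering.")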