When you deploy a server cluster based on Microsoft Cluster Services (MSCS), you must follow a specific set of steps to build it and bring it online. These steps are required to protect the integrity of data on the shared storage device until the cluster software is in place to manage access to it. In this Daily Drill Down, I’ll explain how to set up and configure an MSCS cluster.
Configuring the network
To begin the configuration process, verify that the shared disk is powered off, although it can be physically connected to the cluster nodes at this point. With the shared disk off, boot all of the servers and install Windows 2000; then, turn your attention to configuring the network adapters.
Each node must have two PCI network adapters. One adapter serves as the public cluster interface for client traffic, and the second serves as the private interface for internal cluster communication, such as cluster management and status (heartbeat) signals. Unless you are serving only local users, the public side of the cluster will probably reside on a public, routable subnet or on a private subnet behind a firewall. If you are serving only local users, the public side of the cluster can reside on a private, nonroutable subnet. For the purpose of this example, I’ll assume the latter and use 192.168.0.n for the public side and 192.168.1.n for the private side.
Start by configuring the TCP/IP properties for the private interface on the first node. Open the Network And Dial-Up Connections folder, right-click Local Area Connection 2, and choose Properties. Open the TCP/IP properties and configure the IP address (in this example) to 192.168.1.100 with a subnet mask of 255.255.255.0. Specify the appropriate DNS servers for your network. Click Advanced, click the WINS tab, and select Disable NetBIOS Over TCP/IP. Then, close the TCP/IP properties and open the Properties sheet for the network adapter. Click the Advanced tab and set the actual media type and speed rather than allowing the adapter to select these settings automatically. Check and configure other adapter settings as necessary, keeping in mind that the settings must match on the other nodes in the cluster (a good reason to use the same adapters across the cluster).
Next, configure the TCP/IP properties for the Local Area Connection interface, setting the IP address to 192.168.0.100 and the subnet mask to 255.255.255.0. Set the properties for the network adapter as you did for the private connection, except leave NetBIOS Over TCP/IP enabled. Verify that both connections are working. The next step is optional: Rename the network connections to signify their functions. Doing so will make it easier to keep track of the configuration as you go along. For example, rename Local Area Connection 2 to Private and rename Local Area Connection to Public. You might also add the last two octets of each interface’s IP address to its name to help you identify the connections more easily.
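If you prefer to script these settings, the netsh tool included with Windows 2000 can assign the static addresses from a command prompt. This is a sketch using the example addresses and Windows 2000’s default connection names; verify the actual connection names on your own nodes before running it:

```shell
:: First node, public interface (example address scheme)
netsh interface ip set address "Local Area Connection" static 192.168.0.100 255.255.255.0

:: First node, private interface
netsh interface ip set address "Local Area Connection 2" static 192.168.1.100 255.255.255.0
```

The adapter’s media type, speed, and the NetBIOS Over TCP/IP setting still need to be configured through the property sheets as described above.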
After you configure the IP addresses on all of the nodes, try pinging the public and private interfaces on each node. If that is successful, ping the NetBIOS name of each server to verify that name resolution is working properly.
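For example, from the first node the checks might look like this (the addresses follow the example scheme, and node2 is a placeholder for your second server’s actual name):

```shell
:: From node 1, check the second node's public and private interfaces
ping 192.168.0.101
ping 192.168.1.101

:: Then check name resolution (substitute your server's NetBIOS name)
ping node2
```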
Configuring domain membership and accounts
All nodes in the cluster must belong to the same domain and can be either member servers or domain controllers. If you decide not to install the servers as domain controllers, you don’t need to take any additional steps other than configuring the servers as domain members, which you do through the Network Identification tab of the System property sheet. Otherwise, run the Configure Your Server wizard from the Administrative Tools folder to install Active Directory on the servers and configure them as domain controllers. If you’re not too keen on wizards, you can also do so by running the dcpromo command at a command prompt or from the Start | Run menu. You can join the servers to an existing domain or create a new domain.
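As an alternative to the GUI for member servers, the netdom utility from the Windows 2000 Support Tools can join a node to the domain from the command line. This is a sketch: NODE1 and MYDOMAIN are placeholder names, and netdom must be installed from the Support Tools on the Windows 2000 CD:

```shell
:: Join a node to the domain as a member server (/passwordd:* prompts for the password)
netdom join NODE1 /domain:MYDOMAIN /userd:MYDOMAIN\Administrator /passwordd:*

:: To make the node a domain controller instead, run the promotion wizard
dcpromo
```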
The next step is to configure a cluster account with administrative privileges in the domain. Open the Active Directory Users And Computers console on any domain controller in the domain and add a user account named Cluster in the Users container. Unless security policy prevents it, configure the account for the Password Does Not Expire and User Cannot Change Password options and then add the user to the Administrators group. Do not add the user to the Domain Admins or Enterprise Admins groups. If security policy prevents you from configuring the account’s password to not expire, add a reminder to your calendar to change the password before it expires.
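You can also create the account from a command prompt on a domain controller. This is a sketch: the password and MYDOMAIN are placeholders, and the Password Does Not Expire option still has to be set in the Active Directory Users And Computers console, since net user exposes no switch for it:

```shell
:: Create the Cluster account; /passwordchg:no blocks user password changes
net user Cluster StR0ngPass! /add /domain /passwordchg:no

:: Grant the account administrative rights (run on each member-server node)
net localgroup Administrators MYDOMAIN\Cluster /add
```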
Shut down all servers after you complete the network and domain configuration process. All nodes must be offline as you begin configuring the shared storage device.
Configuring shared storage
There are a handful of requirements for the shared storage between cluster nodes. First, the shared storage must contain at least two partitions: one for use as the quorum disk and at least one other for application and data storage. MSCS uses the quorum disk to store cluster configuration data and log files. The quorum disk partition must be at least 50 MB, and Microsoft recommends a 500-MB partition for the quorum disk.
When a node starts up, it tries to gain control of the quorum disk. Only one node can control the quorum disk. If the node determines that the quorum disk is already owned by another node, it joins that node’s cluster. If the quorum disk is not currently owned, the node forms its own cluster. When a node fails and a secondary must take over, it takes ownership of the quorum disk and restores the latest state from the disk so it can resume services with minimal interruption.
MSCS also uses the quorum disk to determine which node should continue functioning when the nodes lose contact with one another. This can be caused by a server failure or by a communication problem, such as a failed network interface. The node that owns the quorum device places a reservation on it every three seconds. If a failure occurs, the secondary node resets the bus—which releases the reservation—and then waits for 10 seconds. It then attempts to place a reservation on the quorum disk. If the primary node is still functioning, it will already have placed another reservation on the disk and the secondary will not become the active node. If the secondary node is able to place a reservation on the quorum disk, it indicates that the primary node is down; the secondary then becomes the active node and restores its state from the data stored on the quorum disk.
The quorum disk also ensures the ability for a node to obtain the latest state if it must take over as the active node. Normally, nodes in the cluster communicate with one another to share configuration and state information. When a failure occurs, however, that configuration data becomes stale. When a node comes back online, it restores the latest state either from the quorum disk (by taking control of it and becoming the primary) or by communicating with the other nodes in the cluster.
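The challenge-and-defense arbitration described above can be sketched as a small simulation. This is illustrative logic, not MSCS code: the intervals come from the description above, and the outcome simply reflects that the challenger’s 10-second wait always outlasts a live owner’s 3-second renewal cycle.

```python
RENEWAL_INTERVAL = 3   # seconds: the owner re-places its reservation
CHALLENGE_WAIT = 10    # seconds: the challenger waits after the bus reset

def arbitrate(owner_alive):
    """Simulate the quorum challenge after a missed heartbeat.

    The challenger resets the bus, clearing any reservation, then waits
    longer than the owner's renewal interval. A healthy owner re-reserves
    the disk within that window; a failed one cannot.
    """
    reservation = None  # the bus reset clears the existing reservation
    # During the wait, a live owner renews at least once (10 s > 3 s).
    if owner_alive and CHALLENGE_WAIT > RENEWAL_INTERVAL:
        reservation = "owner"
    if reservation is None:  # disk unowned: the challenger takes over
        reservation = "challenger"
    return reservation

# A live owner defends the quorum disk; a failed one loses it.
print(arbitrate(owner_alive=True))   # owner
print(arbitrate(owner_alive=False))  # challenger
```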
The quorum disk partition and data partition(s) on the shared storage device must use NTFS. In addition, you must create the partitions as basic disks rather than dynamic disks. Windows 2000 does not support dynamic disks on the shared storage device.
When you’re ready to start configuring the shared storage, verify that all nodes are shut down. Check the cabling between the shared storage and the nodes to verify the connections and termination. Then, power up the shared storage and the first node. Leave the other nodes off. Open the Disk Management branch of the Computer Management console. If you have yet to create the partitions on the shared storage, do so now. Create the quorum disk and data partition(s), formatting them as NTFS. Assign drive letters to each; in this example, I used drive Q for the quorum drive and Z for a data drive. After you create and format the partitions, verify that you can read and write to them. Create or copy a text file on each shared volume, verify that you can open and read it, and if so, delete the file. If you can’t create the file or have problems opening it, check your connections and termination for potential problems.
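The read/write check on each shared volume amounts to a few commands (Q: and Z: per the example; the file name is arbitrary):

```shell
:: Write, read back, and remove a test file on the quorum volume
echo cluster write test > Q:\clustest.txt
type Q:\clustest.txt
del Q:\clustest.txt

:: Repeat for each data volume
echo cluster write test > Z:\clustest.txt
type Z:\clustest.txt
del Z:\clustest.txt
```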
At this point, you’re ready to test the disk access by all of the other nodes. You must do this one node at a time. Shut down the first node and boot the second. Perform the same tests to determine that you can read and write to the shared volumes and then take down the second node. If you have additional nodes in the cluster, test those in turn, as well. Just remember that only one node can be on at a time without the clustering software in place. Otherwise, you’ll corrupt the shared storage volumes and have to start over.
Adding cluster nodes
When you are satisfied that all nodes can see and work with the shared storage, you’re ready to install the cluster software. Shut down the current node and start up the first node. Use the Add/Remove Programs object in the Control Panel to add the Cluster Service under the Add/Remove Windows Components link. When you click Next, Setup prompts you for the Windows 2000 CD and, after copying files, starts the Cluster Service Configuration wizard. Through the wizard, you specify that the node is either the first one in the cluster or that you are adding a node to an existing cluster. This is the first node, so select that option and then click Next. Enter a name for the cluster and click Next.
The wizard then prompts you to specify the account the cluster will use. Specify the account and password you created for the cluster previously; in this example, that’s the Cluster account. Remember that the account must be a member of the Administrators group. Click Next after you enter the account credentials and domain.
The wizard next prompts for the disks that it will manage. These include the quorum disk and cluster data disks. The wizard should automatically select drives for you. Verify that it has selected the Q and Z drives you created earlier and then click Next. The wizard then prompts you to select the drive for checkpoint and log files (the quorum drive), so select drive Q and click Next.
The wizard next steps you through the process of setting up network connections for the cluster. First, you specify the role for each interface. For example, you specify that the private interface handles internal cluster communications and the public interface handles client access traffic. You can configure an interface to handle both internal and client traffic (called mixed network mode); in this example, the private interface is dedicated to internal traffic, while the public interface carries client traffic and serves as a backup path. So select the private interface, select the Internal Cluster Communications Only option, and click Next. Then, for the public interface, select All Communications and click Next.
By specifying that the public interface can be used for both public and private cluster traffic, you provide a second path for internal traffic as a backup and avoid a single point of failure that would prevent cluster communication should the private interface fail for some reason. If you configure the service to allow internal cluster communication on both interfaces, you also must specify the priority of those interfaces. The wizard prompts you to do so. The interface at the top of the list is used for internal cluster communications unless it becomes unavailable. Verify that the private interface is listed above the public interface or use the Up and Down arrow buttons to change the order. Then, click Next.
In the final step of the wizard, specify the IP address for managing the cluster, as well as the public interface through which clients access the cluster. After you enter the necessary information, click Next and then click Finish. Setup completes the installation and configuration process and starts the Cluster service.
After the first node is up and running, you can add other nodes to the cluster. Leave the first node online and boot the second node. Use the Add/Remove Programs object to add the Cluster service. In the Cluster Service Configuration wizard, specify that the node is joining an existing cluster and then answer the remaining prompts to complete the setup. You’ll find that you have fewer options to specify for a secondary node, which means an easier and shorter configuration.
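Once the Cluster service is running, you can confirm each node’s state with the cluster.exe command-line tool installed with the service (MyCluster is the example cluster name; substitute the name you chose in the wizard):

```shell
:: List the nodes in the cluster and their current status
cluster /cluster:MyCluster node /status
```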
Using clusters helps you provide users with reliable systems. However, when deploying a cluster on a Windows 2000-based network, using Microsoft Cluster Services can get a bit confusing. Follow my step-by-step instructions for a smoother setup and configuration, and look for more handy advice in my upcoming articles on MSCS clusters.