My organization recently upgraded its internal data systems to Windows 2000 and Active Directory. We encountered two key problems and had to apply fixes for both. In this article, I’ll walk you through the problems and the fixes so that you know what to do if you hit these issues or, better yet, can take the preemptive steps necessary to avoid them during your own Windows 2000 upgrade.
The upgrade process
To begin the upgrade, we established a Forest Root Controller named SOL and created an AD structure, first on paper and then on the Forest Root Controller. The AD layout consisted of a root domain (tec-adv.net) and two child domains (office.tec-adv.net and internet.tec-adv.net). Before the upgrade, each child domain existed as a stand-alone Windows NT 4 domain with its own PDC and BDC. We retained the original NT domain names (OFFICE and INTERNET) to avoid naming conflicts.
About 48 hours after we set up our Forest Root Controller and installed Windows 2000 and AD, we proceeded to upgrade the first server in the OFFICE Windows NT domain. In this case, it was a PDC called BEAGLE, built on Windows NT 4 SP6a. We ran the Upgrade Readiness Checker (originally available from Microsoft but since removed from its site and replaced with a simple compatibility list) and found that aside from some NIC driver upgrades, which we performed, the server was ready to be upgraded.
The Windows 2000 Server installation on BEAGLE went well, without any major issues once we changed the DNS settings in BEAGLE’s TCP/IP properties to point to SOL. After the installation of SP1 for Windows 2000, we ran dcpromo and set up the former PDC as an Active Directory domain controller for office.tec-adv.net. Once again, no errors were reported during the procedure, and we allowed the servers to run overnight so replication could occur.
Errors begin to arrive
The following day, we began to notice odd errors, generally pointing to replication failure. Some errors reported that cross-domain permissions could not be assigned or that Enterprise Admins from the Forest could not administer the child domain’s resources. Checking the NTDS settings for the various servers in the AD site led to some interesting insights. First, while there was a connection agreement from BEAGLE to SOL, the Active Directory Sites And Services snap-in running on SOL could not see the complementary agreement on BEAGLE. The reverse showed up in Active Directory Sites And Services on BEAGLE, which displayed the agreement from SOL but not the corresponding agreement on SOL from BEAGLE.
Any attempt to manually force replication on either machine caused an error. Synchronization from the child domain produced the error “RPC Server Is Not Available,” and synchronization from the parent produced the error “The Naming Context Is In The Process Of Being Removed Or Is Not Replicated From The Specified Server.” The DNS settings and the connectivity between the two DCs appeared to be set up correctly, and we could identify no other explanation for the errors.
A search of the MS Knowledge Base turned up one article (Q281485), which described the error we saw when attempting to synchronize from the parent domain and listed a possible solution involving a hotfix available from MS tech support. After a support call to MS to obtain the patch, we followed the instructions in the article and then rebooted both servers, ensuring that the parent DC rebooted first. Once the patch procedure was complete, we found that the errors were still occurring and called MS tech support again to look for another solution.
Identifying a solution
Working with the MS Directory Services team, we were able to determine that a possible cause of the problem was that BEAGLE was assigned an incorrect network name (beagle.tec-adv.com) during the upgrade process. This occurred because under the original NT configuration, the TCP/IP DNS properties for that machine had included a host and domain name for the server, which Windows used to name the server during the upgrade process. The only solution to this problem was a script—again only available from MS tech support—which rewrote registry and AD keys and essentially renamed the server after a reboot. Of course, the potential to destroy the server during a procedure like this is rather high, so MS recommends proceeding only after a system backup.
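The naming mismatch described above is easy to test for once you know what to look for. The following is a minimal Python sketch, not a Microsoft tool: it compares the fully qualified name a server reports against the AD domain it is supposed to belong to. The function name `name_matches_domain` is hypothetical, and the host and domain names are the ones used in this article.

```python
def name_matches_domain(fqdn, ad_domain):
    """Return True if fqdn is a single host label directly under ad_domain."""
    # Normalize case and drop any trailing dot before comparing.
    fqdn = fqdn.rstrip(".").lower()
    ad_domain = ad_domain.rstrip(".").lower()
    # Split off the host label; everything after the first dot is the suffix.
    host, _, suffix = fqdn.partition(".")
    return bool(host) and suffix == ad_domain


if __name__ == "__main__":
    # The name assigned during the upgrade, then the name after the rename script.
    for name in ("beagle.tec-adv.com", "beagle.office.tec-adv.net"):
        print(name, name_matches_domain(name, "office.tec-adv.net"))
```

A check like this only flags the symptom. It does not rename anything, and the rename itself still requires the script from MS tech support.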
After BEAGLE’s reboot, the server name in the Network Identification tab of the Control Panel’s System applet showed correctly (beagle.office.tec-adv.net), and we could proceed with further testing. We attempted to synchronize the AD data again and once again failed, with the same errors. At this point, we enlisted the aid of MS tech support’s networking team to go over the topology of our network for possible problems.
The MS rep immediately identified a problem: We were using two NICs on our DC, one for our internal network (using NAT addressing, with an assigned IP from the NAT pool) and one for our public-facing network (using an assigned public IP address). Because replication can occur over the wrong NIC, Microsoft officially supports only domain controllers with a single NIC. We checked that the NICs were set up in the correct order (by opening Networking And Dial-Up Connections, selecting Advanced Settings from the Advanced menu, and confirming that the internal NIC had the highest priority). Officially, the only other option was to disable the external NIC and then try the replication again. We did so, and suddenly the replication began working perfectly.
That was all well and good for our immediate concerns, but we required the additional NIC for administration and other purposes. The official MS solution was to put the second NIC on a member server in that domain and use that server for any external needs. Unofficially, working with MS tech support, we were able to map out another strategy that works in this scenario.
First, we had to ensure that there was only one gateway available to the server no matter how many NICs were present in the system. To do this, we set the gateway on the external NIC to the router on-site and left that field blank in the TCP/IP settings of the internal NIC. Next, we had to make sure that only one NIC was showing up in the Dynamic DNS system for SOL (the Forest’s DNS master). To accomplish this, we opened the TCP/IP properties of the external NIC, clicked Advanced, and then selected the DNS tab. There, we deselected the Register This Connection’s Address In DNS check box.
Checking the DNS system on SOL, we found that the entries for SOL’s external IP address had disappeared, leaving only the internal IP address visible to the rest of the internal network. This ensured that all DCs, when doing DNS lookups on SOL, would find the correct interface to run synchronization over. From this point onward, synchronization ran without error, even when both NICs were enabled on SOL.
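The same DNS check can be scripted instead of inspected by hand. The sketch below, in Python, resolves a host name and reports any address that falls outside the internal network. The helper names `resolve_ipv4` and `outside_network` are hypothetical, and because this article does not list our actual addresses, the network 10.0.0.0/8 and the sample addresses in the comments are placeholders.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses that DNS reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def outside_network(addresses, internal_network):
    """Return, sorted, the addresses that do not fall inside internal_network."""
    net = ipaddress.ip_network(internal_network)
    return sorted(a for a in addresses if ipaddress.ip_address(a) not in net)


# Example with placeholder addresses (not the real ones from our network):
#   outside_network({"10.0.0.5", "203.0.113.7"}, "10.0.0.0/8")
# An empty result means every registered address is on the internal network.
```

To run it against a live DC, you would pass the result of `resolve_ipv4` for the controller’s fully qualified name (presumably something like sol.tec-adv.net in our case, which is an assumption) into `outside_network`, from a machine that uses the forest’s DNS server.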
In summary, when attempting to set up an Active Directory domain controller with multiple network interface cards, you must first ensure that only one default gateway is specified regardless of how many NICs are present. You must also unregister all but the internal interface from the Active Directory Dynamic DNS system and make sure that this internal interface has the highest priority of your NICs.
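These three rules can be written down as a simple checklist. The Python sketch below is only an illustration of the rules as stated in this article: it validates a hand-written description of a DC’s NICs and does not read anything from Windows. The `Nic` class, the `check_dc_nics` function, and the sample addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Nic:
    name: str
    internal: bool            # True for the interface on the internal network
    gateway: Optional[str]    # default gateway, or None if the field is blank
    registers_in_dns: bool    # state of the DNS registration check box


def check_dc_nics(nics_in_binding_order: List[Nic]) -> List[str]:
    """Return a list of rule violations; an empty list means all rules pass."""
    problems = []
    # Rule 1: only one default gateway, no matter how many NICs exist.
    with_gateway = [n.name for n in nics_in_binding_order if n.gateway]
    if len(with_gateway) > 1:
        problems.append("more than one default gateway: " + ", ".join(with_gateway))
    # Rule 2: only the internal interface may register itself in DNS.
    for n in nics_in_binding_order:
        if n.registers_in_dns and not n.internal:
            problems.append(n.name + " is not internal but registers in DNS")
    # Rule 3: the internal interface must have the highest priority.
    if nics_in_binding_order and not nics_in_binding_order[0].internal:
        problems.append("internal NIC is not first in the binding order")
    return problems


if __name__ == "__main__":
    # Placeholder layout mirroring the fix: gateway only on the external NIC.
    good = [
        Nic("internal", internal=True, gateway=None, registers_in_dns=True),
        Nic("external", internal=False, gateway="203.0.113.1", registers_in_dns=False),
    ]
    print(check_dc_nics(good))
```

The list is passed in binding order on purpose, so the priority rule reduces to checking the first entry.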
When attempting to upgrade an NT 4 server to Windows 2000 Server, you must also remove the Host Name and Domain Name data from the TCP/IP DNS properties page for all NICs; otherwise, Windows 2000 could accidentally use the incorrect naming conventions during the upgrade process. As for the availability of the various patches I mentioned, at the moment they are only available from Microsoft directly. The hotfix for the “naming collision error” will be part of SP3 for Windows 2000. The script required to change the server’s name is, and will always be, available only from MS tech support due to the volatile nature of the patch.