DHCP is a boon for today’s overworked LAN administrators because it takes care of all the IP address administration you’d otherwise have to handle. But when DHCP breaks, your day can go to the dogs unless you have a DHCP disaster recovery plan in place.
In this Daily Drill Down, I’ll go over the steps necessary to carry out a successful recovery of a broken DHCP installation on your Windows 2000 server. I’ll also outline some best practices you can use to minimize the disruption caused by DHCP server problems. While these practices won’t eradicate the problems that do crop up, they’ll help you handle problems more efficiently—and without an endless parade of users asking you why they can’t use the LAN.
Start with the obvious
First things first: If DHCP-enabled clients aren’t communicating with the rest of the LAN, you’re obviously not going to sit right down at the DHCP server and try to figure out what’s gone wrong. Remember Networking 101 and start with the basics:
- Check the PC’s network cable(s) and the network card(s) first.
- Ping the local loopback address.
- Use the ipconfig /all command at the command line of a workstation. If you get an address in the range 169.254.x.x on a Windows 2000 client, you’ll know that the client was unable to obtain an IP address from the DHCP server.
- Try using the ipconfig /release command followed by ipconfig /renew. If that doesn’t help, reboot the failing workstation.
- Reinstall the TCP/IP stack on the failing workstation.
If the problem PCs are on a different subnet from the DHCP server and are connected by a non-BOOTP router, verify the status of the DHCP Relay Agent. If the router is BOOTP-capable, then verify its status as well. This may seem obvious, but in the heat of the moment you can sometimes forget to check the basics, especially if you’re further distracted when users begin to complain that they can’t connect to the LAN.
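If you collect addresses from several workstations, the 169.254.x.x check from the list above is easy to automate. The following is a minimal sketch in Python using the standard ipaddress module; the function name is_apipa and the sample addresses are illustrative assumptions, not part of any Windows tool.

```python
import ipaddress

# 169.254.0.0/16 is the block a client assigns itself
# when no DHCP server answers its request.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if the address is in the 169.254.x.x self-assigned range."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


if __name__ == "__main__":
    # Hypothetical addresses copied from ipconfig /all output.
    for addr in ("169.254.17.3", "192.168.1.50"):
        status = "self-assigned, no DHCP lease" if is_apipa(addr) else "leased or static"
        print(f"{addr}: {status}")
```

A workstation flagged by this check never reached the DHCP server, so the cable, relay agent, and server checks above are the next places to look.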
Find the culprit
If running through the basic checklist doesn’t solve the problem, you’ll need to log on to the Windows 2000 server that is running the DHCP Server Service. Run through the basics on this machine as well, taking care to verify the physical integrity of the server and its components, particularly the network cable(s) and network card(s). Then check the status of the DHCP Server Service and the IP address leases themselves. You can do this in the DHCP MMC by clicking Start | Programs | Administrative Tools | DHCP.
Next, check the Event Viewer to see if it has thrown up any error codes (event IDs) commonly associated with DHCP. These will most likely refer to JET database corruption. The DHCP database, which is a JET database called Dhcp.mdb and is located at %Systemroot%\System32\Dhcp, can sometimes become corrupted, and this will be enough to throw your DHCP server off course.
Big JET databases, those over 25-30 MB in size, are most prone to corruption, so bear this in mind if your installation has problems. If you think this might be a cause of the problem, check the available disk space because insufficient space will prevent the DHCP server from servicing clients. Some common JET database error codes you’ll see in Event Viewer include:
- The JET database returned the following Error: -510.
- The JET database returned the following Error: -1022.
- The JET database returned the following Error: -1850.
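If you have exported the event descriptions to text, a short script can pull out the JET error codes for you. This is only a sketch: it assumes the messages use exactly the wording quoted in the list above, and how you export them from Event Viewer is left open. The function name jet_error_codes and the sample messages are hypothetical.

```python
import re

# Matches the Event Viewer wording quoted in the list above.
JET_ERROR = re.compile(r"The JET database returned the following Error: (-\d+)")


def jet_error_codes(messages):
    """Return the JET error codes found in a list of event message strings."""
    codes = []
    for message in messages:
        match = JET_ERROR.search(message)
        if match:
            codes.append(int(match.group(1)))
    return codes


if __name__ == "__main__":
    # Hypothetical sample of exported event descriptions.
    sample = [
        "The JET database returned the following Error: -510.",
        "The DHCP service has started.",
        "The JET database returned the following Error: -1022.",
    ]
    print(jet_error_codes(sample))
```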
If you see any of these errors, your first course of action should be to perform an offline repair of the Dhcp.mdb file using a utility called Jetpack.exe. Jetpack is included on the Windows 2000 CD-ROM. To use Jetpack, open a command prompt on your Windows 2000 server and type jetpack database_name.mdb temporary_database_name.mdb, where database_name.mdb is the Dhcp.mdb file and temporary_database_name.mdb is the name of a temporary file that Jetpack creates during the repair.
You can use the following commands to compact the DHCP database:
cd %Systemroot%\System32\Dhcp
net stop dhcpserver
jetpack dhcp.mdb tmp.mdb
net start dhcpserver
In this example, Jetpack repairs and compacts the current Dhcp.mdb file into a new file called Tmp.mdb. It then deletes the original (corrupted) database file and renames Tmp.mdb to Dhcp.mdb. If your original Dhcp.mdb file was fairly big, you’ll notice that the file size has shrunk, maybe significantly, once Jetpack has run.
If this method succeeds and the DHCP Server Service starts smoothly, your clients should be able to reconnect to the LAN and successfully renew their IP address leases.
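If you would rather script the stop, Jetpack, start sequence than type it, one possible wrapper looks like the sketch below. It assumes a Python interpreter is available on the server, which the original procedure does not require. The names REPAIR_STEPS and run_repair are illustrative, and the run parameter exists only so the sequence can be rehearsed without touching a real service.

```python
import subprocess

# The stop, Jetpack, start sequence described above, in order.
REPAIR_STEPS = [
    ["net", "stop", "dhcpserver"],
    ["jetpack", "dhcp.mdb", "tmp.mdb"],
    ["net", "start", "dhcpserver"],
]


def run_repair(workdir, run=subprocess.call):
    """Run each step in workdir.

    Returns None if every step exits with code 0; otherwise stops at once
    and returns the step that failed, so later steps are not attempted.
    """
    for step in REPAIR_STEPS:
        if run(step, cwd=workdir) != 0:
            return step
    return None
```

Pass the folder that holds Dhcp.mdb as workdir. Note the design choice: if Jetpack fails, the sketch does not restart the service, because a failed repair is the cue to move on to restoring from backup.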
Is your DHCP server a Windows NT server?
Note that Windows 2000 minimizes DHCP database corruption through dynamic compaction, which simply means that the same compacting operation is carried out periodically without stopping the DHCP Server Service. However, this is not the case in Windows NT. If you’re using a Windows NT server as a DHCP server, you must stop the DHCP Server Service before you run Jetpack.exe.
Well, that didn’t work--now what?
It’s quite possible that the above procedure won’t solve the problem. If it doesn’t, you have other options. Chief among these will be to restore a copy of the DHCP database from a backup. You can try this from a prior tape backup or from the \Backup folder under the \System32\Dhcp folder.
If you do take this route, be sure to stop the DHCP Server Service first. Then make a new backup copy of the entire current (and possibly corrupted) \System32\Dhcp folder, preserving the folder hierarchy, in a location or on a device separate from the backup you’re going to restore from. Next, delete all the files in \System32\Dhcp and restore their equivalents from the backup you chose. Finally, run the Jetpack.exe utility against the newly restored Dhcp.mdb file before restarting the service.
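The safety copy of the current Dhcp folder is the step most easily skipped under pressure, so it is worth scripting. Here is a minimal sketch, assuming Python is available and the DHCP Server Service is already stopped. The function name snapshot_dhcp_folder and the timestamped folder naming are my own choices, not a Windows convention.

```python
import shutil
import time
from pathlib import Path


def snapshot_dhcp_folder(dhcp_dir, backup_root):
    """Copy dhcp_dir, including subfolders, into a new timestamped folder.

    The copy lands under backup_root and the original is left untouched.
    """
    source = Path(dhcp_dir)
    target = Path(backup_root) / ("Dhcp-" + time.strftime("%Y%m%d-%H%M%S"))
    shutil.copytree(source, target)  # raises if target already exists
    return target
```

Point backup_root at a different disk or device from the backup you plan to restore from, as recommended above.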
Once you’ve restarted the service and checked all the scope and lease information, you may see that not all of it is correct and up to date, and you may not see any of the active leases or reservations. This is because there’s an inconsistency between what’s in the backup and what’s in the server’s registry key.
To reconcile these two so that your DHCP database is consistent, open the DHCP console and choose Reconcile All Scopes from the Action menu. This will open a dialog box called Reconcile Database in which you must click Verify. This runs through the DHCP database to check for inconsistencies. If it encounters any, it displays the relevant IP address information, which you need to select and then click Reconcile. Once the reconciliation is complete, the data is added back to the Active Leases for each scope. This information is based on the contents of the registry key at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DHCPServer.
Restart the DHCP Server Service. If you subsequently notice that individual client lease information is incorrect, don’t worry too much, because this will be corrected the next time the clients renew their DHCP leases. In fact, I recommend that you have all the clients renew their leases as soon as possible. Your Active Leases and Reservations information will then be correct, and you can make an immediate backup of the newly restored DHCP database.
Nope, it’s still not working
The remedies I’ve described all assume that your DHCP server is in good physical health. However, if this is not the case and you need to repair or replace some hardware, you’ll have to move the function of the DHCP server to another machine. This is not as complicated as it sounds, and you can do it fairly quickly and without the need to recreate a DHCP database from scratch. There are two phases to this procedure, the first on the source DHCP server and the second on the destination DHCP server.
First, on the source DHCP server, stop and disable the DHCP Server Service. Copy the DHCP folder hierarchy to a temporary location on the destination server. For example, copy %Systemroot%\System32\Dhcp to C:\Tempdhcp. Start the Registry Editor (Regedt32.exe) and find the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DHCPServer key. Save this key to a text file using Save Key from the File menu. (Make sure you don’t choose Save Subtree As!) Finally, make sure this text file is available on the destination DHCP server.
In the next phase, you must go to the destination DHCP server. Make sure the DHCP Server Service is installed and that the server has been rebooted. Stop the DHCP Server Service. Find the temporary location to which you copied the DHCP folder hierarchy in phase one. In this example, it would be C:\Tempdhcp.
Find the System.mdb file and rename it System.src. Move the \System32\Dhcp folder hierarchy (containing the renamed file) into this server’s existing DHCP folder structure, thereby replacing it. Open the Registry Editor, and find and select the DHCPServer key, which is located at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DHCPServer. Restore the registry key you saved as a text file in phase one to this location.
When you’re prompted for a File Name in the Restore Key operation, use this path: %Systemroot%\System32\Dhcp\Backup\Dhcpcfg. When prompted, click Yes, which will overwrite the current registry values with the ones you’ve transferred from the source DHCP server.
Finally, close the Registry Editor and restart the DHCP Server Service. Open the DHCP console and choose Reconcile All Scopes, as I outlined earlier.
After doing so, you’ll have successfully moved and recreated a working DHCP database on a new server. Be sure to renew all client IP address leases and make a backup of the new DHCP server installation as soon as possible.
This Daily Drill Down would not be complete without a look at measures a LAN administrator can take to mitigate the effects of DHCP outages. There are some basic rules that you can apply when designing or upgrading a DHCP Server installation. Not only will they keep productivity high, but they’ll also make you look good when you report how easily you solved a problem.
Why put all your eggs in one basket? In the context of DHCP servers, the 80/20 rule refers to the common practice of splitting DHCP scope ranges into 80 percent and 20 percent portions, which are managed by different physical DHCP servers. That way, if one of the servers dies, clients can still get IP address information from the other machine, greatly reducing the impact of DHCP server outages. Big companies with very large scopes and superscopes often use this method. However, it can also be effective in medium-size and smaller firms.
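To see where the boundary falls for a given scope, you can work out the 80/20 split with a few lines of Python. This is a sketch under stated assumptions: the function name split_80_20 and the sample range are hypothetical, and it only does the arithmetic. You would still configure each server’s scope and exclusions yourself in the DHCP console.

```python
import ipaddress


def split_80_20(first, last):
    """Split the inclusive range first..last into an 80% part and a 20% part.

    Returns two (start, end) tuples of IPv4Address objects.
    """
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    total = end - start + 1
    if total < 2:
        raise ValueError("range needs at least two addresses to split")
    # Round down, but always leave at least one address on each side.
    primary_count = max(1, min(total - 1, (total * 80) // 100))
    boundary = start + primary_count
    primary = (ipaddress.IPv4Address(start), ipaddress.IPv4Address(boundary - 1))
    secondary = (ipaddress.IPv4Address(boundary), ipaddress.IPv4Address(end))
    return primary, secondary


if __name__ == "__main__":
    # Hypothetical scope of 250 addresses.
    primary, secondary = split_80_20("192.168.1.1", "192.168.1.250")
    print("Server A leases:", primary[0], "to", primary[1])
    print("Server B leases:", secondary[0], "to", secondary[1])
```

For the sample range, the first server would lease 192.168.1.1 through 192.168.1.200 and the second 192.168.1.201 through 192.168.1.250.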
Unless you use Windows 2000 DHCP servers, your company could be a victim of rogue DHCP servers. This is a situation in which a new DHCP server comes on line and starts servicing client IP requests. This can be, and often is, benign; for example, someone testing some functionality may have forgotten to disconnect the test machine from the LAN.
The potential for confusion is great—and occasionally comical—in such situations. Microsoft addressed the issue in Windows 2000 by making it compulsory to authorize DHCP servers in Active Directory, which only Enterprise Administrators can do.
If routers segment a LAN, then unless there’s a DHCP server on each segment, client DHCP Discover packets will have to cross the routers to find DHCP servers. If the router is BOOTP capable, then this is a nonissue. However, if the router is older and not BOOTP capable, then the subnet must have a DHCP Relay Agent installed to capture the DHCP Discover packets and forward them to the DHCP server. This is fine, but it adds another layer of complexity and management, thereby increasing both the chances of failure and the LAN administrator’s workload.
Whenever possible, build fault tolerance into a corporate network by duplicating its critical elements. In DHCP terms, this means installing at least two DHCP servers, each of which is capable of stepping into the breach if required. You should also mirror this physical redundancy in the DHCP configuration by using the 80/20 rule for further protection.
Once you have a working Windows 2000 DHCP installation, you can turn on an extremely useful feature known as Conflict Detection. With this feature on, the DHCP server pings an IP address on the LAN before it leases that address to a client. You can turn it on by opening the DHCP console, selecting the server, and choosing Properties from the Action menu.
When the properties window appears, click on the Advanced tab. Under Conflict Detection Attempts, enter a number greater than 0. This number sets the number of times a ping is sent to the LAN. You should use this feature sparingly because it adds load to the server and increases the DHCP server’s response time to client requests. Microsoft recommends that you set the Conflict Detection Attempts value no higher than 2. Turn Conflict Detection off once you’re happy that your DHCP database is in good shape again.
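The idea behind Conflict Detection can be pictured with a small model. To be clear, this is not how the DHCP Server Service is implemented. It is only a sketch of the logic, with a hypothetical pick_address function and a responds callback standing in for the ping.

```python
def pick_address(candidates, responds, attempts=1):
    """Return the first candidate that stays silent for every probe.

    responds(address) stands in for a ping and returns True on a reply.
    With attempts=0 nothing is probed, so the first candidate is returned
    unchecked, mirroring Conflict Detection being switched off.
    """
    for address in candidates:
        if not any(responds(address) for _ in range(attempts)):
            return address
    return None


if __name__ == "__main__":
    # Hypothetical pool in which .10 is already used by a static device.
    in_use = {"192.168.1.10"}
    pool = ["192.168.1.10", "192.168.1.11"]
    print(pick_address(pool, lambda a: a in in_use, attempts=2))
```

Each extra attempt means another probe per silent address before a lease can be offered, which is why higher values slow the server’s responses.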