
Providing DHCP fail-over in Windows NT, part 2: The code

Richard Charrington illustrates how to provide for continuous DHCP services on your network. He also shows you the code for implementing a successful backup transfer of your network's resources.
In “Providing DHCP fail-over in Windows NT, part 1,” we described a method of keeping the DHCP service available if the live DHCP server crashes and can’t be recovered before leases start to expire and users can no longer get an IP address. In part 2, I’ll explain the code that I used to implement this feature. You’ll want to review part 1 so that you can understand what the code is attempting to achieve.

The code explained
Below is the batch file that’s called by the scheduler. The batch file needs to be scheduled only once on the live server. When it runs, it will schedule the same batch file to run on the standby server and will re-schedule the live server to run the batch file again after the next DHCP backup.
Please note that line numbers are for identification only and are not part of the code.
@echo off
::> ------------------------------------
::> --- Backup Live DHCP Jet Database to Standby DHCP server
::> --- Author: R. Charrington – rcharrington@iname.com
::> --- Created: 16/6/99
::> --- Updated: 2/7/99
::> ------------------------------------
::>=====================================
::> This program is called by the scheduler and executes the main program
::> Put the name of the live server and standby server in place of
::> DHCP1 and DHCP2.
::>=====================================
1. Set LIVE=DHCP1
2. Set STANDBY=DHCP2
3. Set Other=%STANDBY%
4. sc \\%LIVE% query dhcpserver | find /i "running" || (set STANDBY=%LIVE%& set LIVE=%Other%)
5. c:\batch\dhcp\dhcpbak.bat 1> c:\batch\dhcpbak.log 2>&1
The action and purpose of this batch file are explained in the header (the lines that start with ::>). Lines 1 and 2 set the environment variables LIVE and STANDBY, and line 3 keeps a copy of the standby name in Other. Line 4 checks to see if what is specified as the live server is actually running the DHCP Server service. If the DHCP Server service isn’t running, the batch file swaps the values of LIVE and STANDBY. The parentheses group the two set commands so that both run only when the check fails.
Below is the batch file that begins the automation of the process. The same batch file runs on both the live and the standby server so that there is only one batch file to maintain. If the DHCP Server service is running, the script assumes that it’s on the live server and runs the relevant portion of code. In the same way, if the DHCP Server service is not running, the script runs the code for the standby server. Of course, it’s possible for both servers to have the DHCP Server service running, and there is code to handle this “error” condition.
Following the code, we’ll provide an explanation of what’s going on.
@ECHO OFF
::> -------------------------------------
::> --- Backup Live DHCP Jet Database to Standby DHCP server
::> --- Author: R. Charrington
::> --- Created: 16/6/99
::> --- Updated: 2/7/99
::> -------------------------------------
::=======================================
:: This file must sit in C:\%HomeDir%\dhcp (see below for setting HomeDir param) on at least
:: one of the DHCP servers in a failover set.
:: The set is defined in the calling script "at-dhcp.bat".
:: One part of the script below will replicate the contents of the above directory
:: to the DHCP partner.
::=======================================
1. set HomeDir=cr-utils
2. c:
3. cd \%HomeDir%
4. ::> --- Check to see at least one DHCP server is live
5. sc \\%LIVE% query dhcpserver | find /i "running" || sc \\%STANDBY% query dhcpserver | find /i "running" || GOTO NoDHCP
6. ::> --- Synchronize the batch job area (/xo = eXcluding Older files)
7. robocopy c:\%HomeDir%\dhcp \\%STANDBY%\c$\%HomeDir%\dhcp *.* /xo /z /r:8 /w:15
8. robocopy \\%STANDBY%\c$\%HomeDir%\dhcp c:\%HomeDir%\dhcp *.* /xo /z /r:8 /w:15
9. ::> --- If not on the LIVE DHCP server, do the STANDBY stuff
10. sc query dhcpserver | find /i "running" || goto STANDBY
11. ::> --- Check to see that the other server is accessible
12. ping -w 3000 -n 1 %STANDBY% | FIND /i "Reply from" || GOTO NoStandby
13. ::> --- This is the LIVE DHCP server, but let's check to see the other isn't live too
14. sc \\%STANDBY% query dhcpserver | find /i "running" && goto BothRunning
15. ::> --- Copy Backup contents to Backup1 on LIVE
16. robocopy c:\winnt\system32\dhcp\backup c:\dhcp-backup1 *.* /z /s /mir /r:8 /w:15
17.
18. ::> --- Create a backup of the DHCPServer registry hive
19. if exist DHCPServer del DHCPServer
20. reg save HKLM\SYSTEM\CurrentControlSet\Services\DHCPServer DHCPServer
21. copy DHCPServer c:\dhcp-backup1
22. ::> --- Scheduling re-run for 20 minutes after last DHCP server backup
23. ::> --- This allows 5 minutes "grace" so as not to interfere with
24. ::> --- the DHCP server backup
25. set TIME=22
26. dir c:\winnt\system32\dhcp\backup\DhcpCfg /l | awk "/dhcpcfg/{print $2}" | awk time.awk time=%TIME% f="at %%02d:%%02d c:\%HomeDir%\dhcp\at-dhcp.bat"
27.
28. ::> --- Check to make sure the job isn't scheduled for tomorrow
29. ::> --- This can happen after a failover
30. at | find /i "at-dhcp" | find /i "today" && goto SkipNewAt
31. at | find /i "at-dhcp" | awk "{system(c $1 d)}" c="at " d=" /d"
32. MySoon \\%LIVE% 20 c:\%HomeDir%\dhcp\at-dhcp.bat
33. :SkipNewAt
34. ::> --- Push Backup1 to STANDBY Backup1
35. robocopy c:\dhcp-backup1 \\%STANDBY%\C$\dhcp-backup2 *.* /z /s /mir /r:8 /w:15
36. ::> --- Schedule this batch file to run on STANDBY right away
37. soon \\%STANDBY% 60 c:\%HomeDir%\dhcp\at-dhcp.bat
38. REM Done
39. GOTO :EOF
40. :STANDBY
41. ::> -- On Standby server
42. ::> --- This section runs only on the STANDBY DHCP server
43. ::> -------- WE COULD CHECK HERE THAT THE SERVICE IS RUNNING ON THE OTHER SERVER, AND, IF NOT, THEN START IT HERE
44. ::> --- Copy the Jet database to its proper location
45. ::> --- Must have files in backup\Jet\new as well
46. robocopy c:\dhcp-backup2\jet\new c:\winnt\system32\dhcp *.* /z /s /mir /xf /r:8 /w:15
47. robocopy c:\dhcp-backup2 c:\winnt\system32\dhcp\backup *.* /z /s /mir /xf /r:8 /w:15
48. ::> --- Restore Registry key on STANDBY server
49. ::> --- REG RESTORE doesn't recognise pathname for file, so have to
50. ::> --- CD to the right path first
51. cd \dhcp-backup2
52. ECHO y | reg restore DHCPServer HKLM\SYSTEM\CurrentControlSet\Services\DHCPServer
53. ::> --- Disable the service (enabled by the registry restore)
54. sc config DHCPServer start= disabled
55. ::> --- Check to see that the other server is accessible
56. ping -w 3000 -n 1 %STANDBY% | FIND /i "Reply from" || GOTO NoStandby
57. ::> --- Schedule this on the LIVE server if it's not already there
58. at \\%STANDBY% | find /i "at-dhcp" || soon \\%STANDBY% 60 c:\%HomeDir%\dhcp\at-dhcp.bat
59. GOTO :EOF
60. :BothRunning
61. echo It appears that the DHCP service is running on both %LIVE% and %STANDBY%. Please check. > msg.txt
62. call :Email
63. REM Done
64. GOTO :EOF
65. :NoDHCP
66. ::> --- Email everyone
67. echo It appears that the DHCP service is not running on %LIVE% or %STANDBY%. Please check. > msg.txt
68. call :Email
69. ::> and run job again
70. soon 900 c:\%HomeDir%\dhcp\at-dhcp.bat
71. REM Done
72. GOTO :EOF
73. :NoStandby
74. echo It appears that %STANDBY% is not responding. Please check. > msg.txt
75. ::> --- Email everyone
76. call :Email
77. ::> --- Re-schedule this batch file to run again in 15 minutes
78. Soon 900 c:\%HomeDir%\dhcp\at-dhcp.bat
79. REM Done
80. Goto :EOF
81. :Email
82. REM Replace ‘admin@company.com’ with the name of the person or persons to receive error messages.
83. Blat admin@company.com file="msg.txt" from="%LIVE%" subj="DHCP: Problem"
84. goto :EOF
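The order of the checks in the listing above decides which section of the batch file runs. As a rough illustration only (this is a Python sketch, not part of the author's batch files, and the function and argument names are hypothetical), the initial dispatch might be summarized like this. The booleans stand in for the results of the sc and ping checks, and the second ping inside the STANDBY section (line 56) is not modeled.

```python
def choose_branch(local_running, live_running, standby_running, other_reachable):
    """Return the name of the section the batch file would run.

    Each argument is a boolean standing in for one sc or ping check.
    The names are hypothetical and used only for this illustration.
    """
    # Line 5: neither server reports a running DHCP Server service.
    if not (live_running or standby_running):
        return "NoDHCP"
    # Line 10: no local DHCP service means this is the standby server.
    if not local_running:
        return "STANDBY"
    # Line 12: the live server must be able to reach its partner.
    if not other_reachable:
        return "NoStandby"
    # Line 14: the partner should not be running DHCP as well.
    if standby_running:
        return "BothRunning"
    # Otherwise the live-server work (lines 15 to 39) goes ahead.
    return "LIVE"


if __name__ == "__main__":
    print(choose_branch(True, True, False, True))
    print(choose_branch(False, True, False, True))
```

Reading the checks as one function makes the precedence visible: the "no DHCP anywhere" test wins over everything else, and the "both running" test is reached only on a server that is itself running DHCP and can reach its partner.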
A few notes
The following utilities can be found in the NT Resource Kit:
  • Sc: Service control
  • Reg: Registry handling
  • Robocopy: Xcopy with attitude
  • Soon: At enhancement
Awk and Blat aren’t supplied with Windows NT or the Windows NT Resource Kit.
Awk is a utility that comes from the UNIX world. It’s extremely useful in helping you get exactly what you want from the output of other commands or for combining variable input to build commands. Awk is a scripting language in its own right. Time.awk is an awk script, in which the awk commands are held in a file (see line 26) rather than being part of the command line (see line 31).
Blat is a program that lets you send e-mail from the command line or from a batch file.
MySoon is a variation of Soon. It’s a batch script that incorporates Awk and schedules a job in n minutes rather than in seconds. In addition, it works correctly on remote servers in different time zones, while Soon does not.
The variables LIVE and STANDBY refer to the server running the script and the other DHCP server, respectively. The script will determine which is the live server and which is the standby.
The time interval of 22 minutes (line 25) assumes that the DHCP Backup interval hasn’t been changed from the default of 15 minutes. If it’s different, change the time value in line 25 to the DHCP Backup interval plus half of the DHCP Backup interval. (For example, if the Backup interval is set to 30 minutes, set TIME to 45 minutes, that is, 30 + 15.)
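The interval rule above is simple arithmetic, and a small sketch may help when the Backup interval isn't the default. This is an illustration in Python, not part of the batch files. The function name is hypothetical, and rounding the half interval down to whole minutes is an assumption chosen so that 15 minutes gives the 22 used on line 25.

```python
def rerun_delay_minutes(backup_interval):
    """Backup interval plus half of it, rounded down to whole minutes."""
    return backup_interval + backup_interval // 2


if __name__ == "__main__":
    for interval in (15, 30, 60):
        # Prints the value you would put in "set TIME=" on line 25.
        print(interval, "->", rerun_delay_minutes(interval))
```

With the default interval of 15 minutes this gives 22, and with 30 minutes it gives 45, matching the example in the text.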
Annotations
Let’s give the script a close look.
1. set HomeDir=cr-utils
2. c:
3. cd \%HomeDir%
4. ::> --- Check to see at least one DHCP server is live
5. sc \\%LIVE% query dhcpserver | find /i "running" || sc \\%STANDBY% query dhcpserver | find /i "running" || GOTO NoDHCP
Line 1 sets a variable to point to the path (on the C: drive) where this batch file, Awk, Blat, Time.awk, and MySoon are held.
Line 5 uses sc (Service Control, from the NT Resource Kit) to query the status of the DHCP Server service. The returned status is checked, by find, for the word “running”. If it’s NOT found (the two pipe characters “||” equate to “if not”), then we check the other server in the same way. If the DHCP Server service isn’t running there either, the script jumps to the NoDHCP label, where an error message is e-mailed to the administrator.
6. ::> --- Synchronize the batch job area (/xo = eXcluding Older files)
7. robocopy c:\%HomeDir%\dhcp \\%STANDBY%\c$\%HomeDir%\dhcp *.* /xo /z /r:8 /w:15
8. robocopy \\%STANDBY%\c$\%HomeDir%\dhcp c:\%HomeDir%\dhcp *.* /xo /z /r:8 /w:15
Lines 7 and 8 synchronize the batch file directory on both servers. This action allows changes to the batch file to be made on either server. Since it’s one of the first things that the batch file does, any changes will be incorporated immediately.
9. ::> --- If not on the LIVE DHCP server, do the STANDBY stuff
10. sc query dhcpserver | find /i "running" || goto STANDBY
11. ::> --- Check to see that the other server is accessible
12. ping -w 3000 -n 1 %STANDBY% | FIND /i "Reply from" || GOTO NoStandby
Line 10 checks the status of the DHCP Server service on the local machine. If this line doesn’t return “running”, then this must be the standby DHCP server, and the code branches off to the relevant label.
If it’s the live DHCP server, line 12 pings the other server to ensure that it’s online. (We’re going to try to send it some files.) If it’s not online, we e-mail the administrator.
13. ::> --- This is the LIVE DHCP server, but let's check to see the other isn't live too
14. sc \\%STANDBY% query dhcpserver | find /i "running" && goto BothRunning
15. ::> --- Copy Backup contents to Backup1 on LIVE
16. robocopy c:\winnt\system32\dhcp\backup c:\dhcp-backup1 *.* /z /s /mir /r:8 /w:15
17.
18. ::> --- Create a backup of the DHCPServer registry hive
19. if exist DHCPServer del DHCPServer
20. reg save HKLM\SYSTEM\CurrentControlSet\Services\DHCPServer DHCPServer
21. copy DHCPServer c:\dhcp-backup1
We know the batch file is running on the live DHCP server, but we also need to check whether the DHCP Server service is running on the other server. This check is accomplished by line 14. If the service is running on both servers, we e-mail the administrator.
If all is well, line 16 uses Robocopy to mirror the DHCP backup directory to a local holding area. We do it this way, rather than copying straight to the other server, because a local Robocopy is very quick, and there’s no chance of it still copying when the DHCP service backup runs.
Line 20 creates a copy of the DHCP Server service registry hive. It will fail if the file to which it is saving already exists, so in line 19 we make sure that it does not.
Line 21 copies the file that was created in line 20 to the holding area.
22. ::> --- Scheduling re-run for 20 minutes after last DHCP server backup
23. ::> --- This allows 5 minutes "grace" so as not to interfere with
24. ::> --- the DHCP server backup
25. set TIME=22
26. dir c:\winnt\system32\dhcp\backup\DhcpCfg /l | awk "/dhcpcfg/{print $2}" | awk time.awk time=%TIME% f="at %%02d:%%02d c:\%HomeDir%\dhcp\at-dhcp.bat"
27.
28. ::> --- Check to make sure the job isn't scheduled for tomorrow
29. ::> --- This can happen after a failover
30. at | find /i "at-dhcp" | find /i "today" && goto SkipNewAt
31. at | find /i "at-dhcp" | awk "{system(c $1 d)}" c="at " d=" /d"
32. MySoon \\%LIVE% 20 c:\%HomeDir%\dhcp\at-dhcp.bat
Line 26 pipes the result of a dir command to awk, which pipes the time field to another awk command. This awk command adds the number of minutes (set in line 25) to the time, formats an at command, and executes it. The at command can be seen between the double quotes after f= on line 26. The first %%02d will be replaced by the hour, and the second one by the minute.
Line 30 checks the output of an at command to ensure that the job just scheduled is scheduled for today (that is, the time calculated was not before the present time). If it’s scheduled correctly for today, then the script skips lines 31 and 32 (“&&” equates to “if the previous command succeeds, then run the following command”). If the job is scheduled for tomorrow (see “Providing DHCP fail-over in Windows NT, part 1” for an explanation of how it can happen), then line 31 removes it and line 32 reschedules it for 20 minutes from now (rather than using the file date/time field).
33. :SkipNewAt
34. ::> --- Push Backup1 to STANDBY Backup1
35. robocopy c:\dhcp-backup1 \\%STANDBY%\C$\dhcp-backup2 *.* /z /s /mir /r:8 /w:15
36. ::> --- Schedule this batch file to run on STANDBY right away
37. soon \\%STANDBY% 60 c:\%HomeDir%\dhcp\at-dhcp.bat
38. REM Done
39. GOTO :EOF
Once this job is scheduled to run again, line 35 mirrors the holding area to a holding area on the standby server.
Line 37 schedules this job to run on the standby server in 60 seconds.
GOTO :EOF ends the batch file; we have completed the live server work. :EOF is a pseudo-label that means “end of file.”
40. :STANDBY
41. ::> -- On Standby server
42. ::> --- This section runs only on the STANDBY DHCP server
43. ::> -------- WE COULD CHECK HERE THAT THE SERVICE IS RUNNING ON THE OTHER SERVER, AND, IF NOT, THEN START IT HERE
44. ::> --- Copy the Jet database to its proper location
45. ::> --- Must have files in backup\Jet\new as well
46. robocopy c:\dhcp-backup2\jet\new c:\winnt\system32\dhcp *.* /z /s /mir /xf /r:8 /w:15
47. robocopy c:\dhcp-backup2 c:\winnt\system32\dhcp\backup *.* /z /s /mir /xf /r:8 /w:15
The standby server work begins here.
Line 46 mirrors the directory in the holding area that contains the DHCP Jet database to the dhcp directory. Line 47 mirrors the holding area to the dhcp backup directory. This step is the essence of this procedure.
48. ::> --- Restore Registry key on STANDBY server
49. ::> --- REG RESTORE doesn't recognize pathname for file, so have to
50. ::> --- CD to the right path first
51. cd \dhcp-backup2
52. ECHO y | reg restore DHCPServer HKLM\SYSTEM\CurrentControlSet\Services\DHCPServer
53. ::> --- Disable the service (enabled by the registry restore)
54. sc config DHCPServer start= disabled
Line 52 effectively “mirrors” the DHCP Server service registry hive from the one that was created on the live server. This procedure is necessary because the hive contains static addresses and other non-default configuration information. The Reg command won’t recognize a filename that includes a path, which is why, in line 51, we change to the directory containing the registry file. Also, the Reg command prompts for confirmation, which is why echo y is piped to it.
As the registry hive comes from the live server, the DHCP Server service Startup parameter is set to Automatic. Line 54 changes it to Disabled (it could be set to demand) so that the service won’t start if the standby server is rebooted for any reason.
55. ::> --- Check to see that the other server is accessible
56. ping -w 3000 -n 1 %STANDBY% | FIND /i "Reply from" || GOTO NoStandby
57. ::> --- Schedule this on the LIVE server if it's not already there
58. at \\%STANDBY% | find /i "at-dhcp" || soon \\%STANDBY% 60 c:\%HomeDir%\dhcp\at-dhcp.bat
59. GOTO :EOF
Line 56 checks to see that the live server is responding. (We are about to access it.) If there is no ping response, then we e-mail the administrator.
Line 58 checks to see that this job is scheduled on the live server. If not ('||'), it will be scheduled to run in one minute.
60. :BothRunning
61. echo It appears that the DHCP service is running on both %LIVE% and %STANDBY%. Please check. > msg.txt
62. call :Email
63. REM Done
64. GOTO :EOF
65. :NoDHCP
66. ::> --- Email everyone
67. echo It appears that the DHCP service is not running on %LIVE% or %STANDBY%. Please check. > msg.txt
68. call :Email
69. ::> and run job again
70. soon 900 c:\%HomeDir%\dhcp\at-dhcp.bat
71. REM Done
72. GOTO :EOF
73. :NoStandby
74. echo It appears that %STANDBY% is not responding. Please check. > msg.txt
75. ::> --- Email everyone
76. call :Email
77. ::> --- Re-schedule this batch file to run again in 15 minutes
78. Soon 900 c:\%HomeDir%\dhcp\at-dhcp.bat
79. REM Done
80. Goto :EOF
Lines 60 to 80 are the error-handling routines.
Lines 60 to 64 send an e-mail message to the administrator if the DHCP Server service is found to be running on both the live and standby servers. On line 62, the label :Email is called as if it were a separate batch file. This use of the call command is available if Command Extensions are enabled in NT (which they are by default). Just as with a called batch file, reaching the end of the file returns control to the line after the call.
Lines 65 to 72 provide the error message e-mail in case the DHCP Server service isn’t running on either server. Once the message is sent, line 70 reschedules this job to run again in 15 minutes.
Lines 73 to 80 provide the error message e-mail in case the other server isn’t responding. Once the message is sent, line 78 reschedules this job to run again in 15 minutes.
81. :Email
82. REM Replace ‘admin@company.com’ with the name of the person or persons to receive error messages.
83. Blat admin@company.com file="msg.txt" from="%LIVE%" subj="DHCP: Problem"
84. goto :EOF
Lines 81 to 84 contain the e-mail subroutine. Goto :EOF on line 84 equates to a return statement.
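Time.awk itself isn't listed in this article, so the scheduling arithmetic behind lines 26 and 30 can only be sketched. The Python below is an illustration under stated assumptions, not the author's script: the function name and its arguments are hypothetical, times are passed as HH:MM strings, and a target time that is not later than the current time is assumed to be scheduled by at for tomorrow, which is the case that line 30 looks for.

```python
from datetime import datetime, timedelta


def build_at_command(backup_time, delay_minutes, now, command):
    """Format an at command for backup_time + delay_minutes.

    backup_time and now are "HH:MM" strings taken from the same day.
    Returns the command text and a flag that is True when the target
    time has already passed (or wraps past midnight), so the job would
    not run today.
    """
    base = datetime.strptime(backup_time, "%H:%M")
    target = base + timedelta(minutes=delay_minutes)
    current = datetime.strptime(now, "%H:%M")
    # A target on the next calendar day, or at or before the current
    # clock time, cannot run today.
    runs_tomorrow = target.date() != base.date() or target.time() <= current.time()
    # Same zero-padded layout as the "at %%02d:%%02d ..." format on line 26.
    at_line = "at %02d:%02d %s" % (target.hour, target.minute, command)
    return at_line, runs_tomorrow


if __name__ == "__main__":
    print(build_at_command("10:00", 22, "10:05", "at-dhcp.bat"))
    print(build_at_command("10:00", 22, "10:30", "at-dhcp.bat"))
```

When the flag comes back True, the batch file's answer is to delete the job and reschedule it relative to the current time (lines 31 and 32) instead of relative to the backup file's timestamp.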
Conclusion
That is the way that I did it, but I’m sure that there are other ways, such as coding it all in Perl, for example.
If you find it interesting and useful but don’t want to go to the trouble of typing everything yourself, just send me an e-mail and I will return a copy of the batch files, Awk, Blat, MySoon, and Time.awk. Then, you can just put in the names of your servers and the e-mail address that will receive error messages, and you’re all set.
Richard Charrington’s computer career began when he started working with PCs, back when they were known as microcomputers. Starting as a programmer, he worked his way up to the lofty heights of a Windows NT Systems Administrator, and he has done just about everything in between. Richard has been working with Windows since before it had a proper GUI and with Windows NT since it was LANManager. Now a contractor, he has slipped into script writing for Windows NT and has built some very useful auto-admin utilities.
The authors and editors have taken care in preparation of the content contained herein, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for any damages. Always have a verified backup before making any changes.