Your server’s registry is the place where angels fear to tread. It’s a mysterious place of strange spellings, capitalization, and spacing, but it’s also the heart of your server’s configuration. If there’s a problem with your server’s registry, chances are your entire network will suffer. In this Daily Drill Down, I’ll describe some problems with the system hive of the registry and how to recover from them.
A registry refresher
The registry is a database that records and updates any and all settings changes you make on the computer. If you change your wallpaper or your display settings or install software on your server, Windows 2000 records these changes in the registry. The registry consists of subtrees, keys, values, data types, and hives.
Subtrees represent the logical structure of the registry. Keys, values, and data types are all the information contained in the registry subtrees. They are all part of the registry’s logical structure. Data types represent the kinds of data the registry is expected to record (for instance, simple text strings or binary information). Hives, on the other hand, represent the registry’s physical structure.
You can edit your server’s registry using two different utilities, Regedit (Regedit.exe) and Regedt32 (Regedt32.exe). Other than appearance, the difference between the two registry editors is fairly small. Principally, Regedt32 offers a security feature called Read Only Mode that, when selected from the Options menu, lets users read the registry but not edit it. This minimizes the risk of users making small mistakes with big consequences. Regedit doesn’t have this feature, but it does offer a direct path to all of the subtrees, rather than requiring you to jump from window to window. Although the two tools look different and have subtle internal differences, they’re referred to by the same name, the Registry Editor.
The system hive holds your configuration settings
The system hive contains the system configuration information a machine needs in order to boot correctly: which drivers to load, information on hardware profiles, what services to start, and which software settings to implement. You’ll find the system hive in the Registry Editor at HKEY_LOCAL_MACHINE\SYSTEM. The system hive stores configuration data in the CurrentControlSet subkey.
Just like any database file, the system hive exists as a physical file on your server’s hard drive. You’ll find it in the %systemroot%\system32\config directory. If you look in this directory, you’ll see a number of files, including one called System and another called System.alt. System is the system hive, and System.alt is a complete copy of the System file, kept for fault tolerance.
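If you’d like to see those two files without opening Explorer, a few lines of Python will do it. This is only a sketch: the function names hive_paths and report_hives are my own, and the fallback to C:\WINNT when the SystemRoot environment variable isn’t set is an assumption about a default installation.

```python
import os
from pathlib import Path


def hive_paths(system_root=None):
    """Return the expected locations of the System and System.alt files."""
    # Assumption: use the SystemRoot environment variable, else C:\WINNT.
    root = system_root or os.environ.get("SystemRoot", r"C:\WINNT")
    config_dir = Path(root) / "system32" / "config"
    return config_dir / "system", config_dir / "system.alt"


def report_hives(system_root=None):
    """Map each hive file name to its size in bytes, or None if it is missing."""
    report = {}
    for path in hive_paths(system_root):
        report[path.name] = path.stat().st_size if path.is_file() else None
    return report
```

Calling report_hives() with no argument checks the machine you run it on; passing a folder lets you point it at a copy of the Windows directory instead.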
Problems with the system hive
A limit on the size of the system hive file can cause problems. To load properly, the system hive file should be no larger than 13 MB. If it’s larger, the Windows 2000 boot process will fail.
This limit exists because the system hive loads in a low-level environment in which only 16 MB of RAM is available to the boot process. The system hive loading process must share that 16 MB with the NT loader, the NT kernel, the HAL, and any required boot drivers. On a typical server, that doesn’t leave much RAM to spare, so a big system hive can cause the boot process to fail.
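The arithmetic behind those two figures is easy to put into code. The sketch below takes the 16 MB and 13 MB numbers from this article and assumes 1 MB means 1,048,576 bytes; the names are mine, and the 3 MB result is simply what the two figures imply is left for the loader, kernel, HAL, and boot drivers, not a measured value.

```python
MB = 1024 * 1024              # assumption: binary megabytes
BOOT_MEMORY_BYTES = 16 * MB   # RAM available to the boot process, per the article
HIVE_LIMIT_BYTES = 13 * MB    # largest system hive the article says will load


def implied_overhead_bytes():
    """Memory the two figures leave for the loader, kernel, HAL, and boot drivers."""
    return BOOT_MEMORY_BYTES - HIVE_LIMIT_BYTES


def hive_too_large(size_bytes, limit_bytes=HIVE_LIMIT_BYTES):
    """Return True when a hive of size_bytes exceeds the limit."""
    return size_bytes > limit_bytes
```

A hive of exactly 13 MB passes this check, because the article describes the limit as "no larger than" 13 MB.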
The system hive file can be quite large even in an ordinary machine. For example, one of the machines I run is a laptop with various standard applications. It’s not a heavy-duty machine by any means, but the system hive is 2.6 MB. Another machine I run as a testing server has a system hive of 5.5 MB, and it’s not doing anything really fancy either.
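To put those two machines in perspective, here is a small sketch that works out how much headroom each has under the 13 MB limit. The function name hive_headroom is hypothetical, and the sizes are just the two examples from my own machines.

```python
HIVE_LIMIT_MB = 13.0  # limit quoted in this article


def hive_headroom(size_mb, limit_mb=HIVE_LIMIT_MB):
    """Return (MB remaining, percent of the limit used), rounded to one decimal."""
    remaining = round(limit_mb - size_mb, 1)
    percent_used = round(size_mb / limit_mb * 100, 1)
    return remaining, percent_used


# The two example machines described above.
for label, size_mb in (("laptop", 2.6), ("test server", 5.5)):
    remaining, used = hive_headroom(size_mb)
    print(f"{label}: {size_mb} MB hive, {remaining} MB headroom, {used}% of limit used")
```

Neither machine is close to trouble, but the test server has already used more than 40 percent of the limit without doing anything fancy.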
Add to this the fact that, as the system hive grows, it becomes fragmented. Fragmentation is bad in itself, but it also leads to file corruption. If all this happens to your system hive, you’re going to have a dead server on your hands before too long.
There’s a way to limit the size of the system hive so that it doesn’t grow beyond 13 MB; however, it doesn’t work with Windows 2000 domain controllers. It may help on servers that host a large number of shared resources and whose registries have grown too big. If your server falls into this category, you can find the registry entries you need to make by checking the Microsoft Knowledge Base.
Restoring the system hive
If a server fails on boot because of system hive problems, there are various approaches you can take to remedy the situation and get your server running again. These involve the Emergency Repair Disk, the Windows 2000 Recovery Console, and the Emergency Repair Process.
The Emergency Repair Disk
Repairing the system hive so the computer will boot to a usable state is a relatively simple matter, but how much work you’ll have to do once you’re up and running again depends on how recent your copy of the registry is. That’s why the ERD process is so important. If you have an ERD, you have a recent copy of the system hive, and you need only do a few things to restore the hive. If you don’t have an ERD, then you’ll have some more work to do to get your server back in shape.
Of course, in order to use an ERD, you need to have made one in the first place. Many administrators put this task off because server configurations can and do change. However, if you haven’t made an ERD yet, you should make one at the earliest possible opportunity—such as right after you finish reading this article. The ERD is not a boot diskette—you use it in conjunction with the Windows 2000 Recovery Console.
You can make an ERD by using the Windows 2000 Backup program. To start the process, click Start | Run. When the Run dialog box opens, type ntbackup and click OK. When the Backup program starts, click the Emergency Repair Disk button and follow the prompts. When you see the Emergency Repair Diskette screen, check the Also Backup The Registry box.
Creating an ERD writes files to a floppy, and it also saves copies on your server’s hard drive in the %systemroot%\repair\regback folder. This folder contains the most recent copy of the files copied to the ERD, including the latest version of the system hive file. You should create a new ERD every time you apply a service pack or update a driver. This ensures that the ERD has a fresh copy of the system hive.
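One way to keep yourself honest about ERD freshness is to check the date on the hive copy in the regback folder. The sketch below does that with the file’s modification time. The function names are hypothetical, and the 30-day default is an arbitrary threshold I picked for illustration, not a Microsoft recommendation.

```python
import os
import time

SECONDS_PER_DAY = 86400


def backup_age_days(path, now=None):
    """Return the number of days since the file at path was last modified."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / SECONDS_PER_DAY


def regback_is_stale(path, max_age_days=30, now=None):
    """Return True when the backup is older than max_age_days.

    max_age_days is an illustrative default; choose one that matches
    how often your server's configuration changes.
    """
    return backup_age_days(path, now) > max_age_days
```

On a default installation you would point it at C:\WINNT\repair\regback\system, and treat a stale result as a reminder to run the ERD process again.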
The Windows 2000 Recovery Console
The Recovery Console is a tool you can use for advanced administrative purposes. You can run it from the Windows 2000 CD at boot, or you can install it onto a server. If you haven’t previously installed the Recovery Console on your server, I highly recommend that you do so.
To install the Recovery Console, insert the Windows 2000 CD into the CD-ROM drive. Open a command prompt and type drive:\i386\winnt32.exe /cmdcons, where drive is the drive letter for the CD-ROM drive. Click Yes to start the installation procedure. Then restart the server.
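If you install the Recovery Console on many servers whose CD-ROM drives use different letters, you might generate the command line rather than type it each time. This is a minimal sketch: the function name is my own, and it only builds the string shown above; it does not run anything.

```python
def cmdcons_install_command(cd_drive):
    """Build the Recovery Console install command for the given CD-ROM drive letter."""
    # Accept "d", "D:", or "D:\" and normalize to a single upper-case letter.
    letter = cd_drive.strip().rstrip(":\\").upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError(f"not a drive letter: {cd_drive!r}")
    return f"{letter}:\\i386\\winnt32.exe /cmdcons"


print(cmdcons_install_command("d"))
```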
The next time you boot the server, the Microsoft Windows Recovery Console will appear as a choice on the server’s boot menu. If you want to start the repair process, you can select this choice; otherwise let the server boot as normal.
You can also run the Recovery Console from the Windows 2000 CD at boot time. Make sure your server’s BIOS is set to boot from the CD-ROM drive first. Once you get to the text-mode portion of Setup, you’re prompted to install Windows 2000 or press R to repair an existing installation. Naturally, you’ll press R, because you don’t want to completely reinstall Windows 2000.
The next screen you’ll see asks you to choose between using the Recovery Console or the Emergency Repair Process. What’s the difference? The Recovery Console presents you with a command line starting in the %systemroot% directory. As we’ll see, you can repair your server this way by renaming files.
The Emergency Repair Process
By contrast, the Emergency Repair Process presents two options to fix your system: Manual Repair and Fast Repair. Manual Repair inspects the Windows 2000 startup environment, verifies Windows 2000 system files, and inspects the boot sector. It doesn’t check the registry files. Choose the Fast Repair option and, assuming that %systemroot%\repair is accessible, the registry files will be checked.
If you can’t start your server because of a problem with the system hive, then you’ll likely see a message such as “Error Message: Windows Could Not Start Because the Following File Is Missing or Corrupt: \Winnt\System32\Config\Systemced.” You’ll also see this error message if you’ve installed a Promise ATA66 IDE PCI controller card; if you have one, remove it and try again.
Restoring the system hive with an ERD
To restore the system hive with a recently created ERD, start an instance of the Recovery Console. At the command prompt, type the following commands, pressing [Enter] after you type each command:
cd system32\config
ren system system.old
ren system.alt systemalt.old
copy c:\winnt\repair\regback\system c:\winnt\system32\config
During the process, you’ll rename the corrupted System and System.alt hive files and replace them with the most recent version of the system hive, the copy saved in %systemroot%\repair\regback when you created your ERD. While you’re doing all this, you’ll need your copy of the ERD because the repair process looks to it for various files, the most important of which is Setup.log, a record of all installed files with their cyclical redundancy check (CRC) data. With this information, the repair process can restore your old system hive from the Regback folder.
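If you want to rehearse the rename-and-copy sequence before you need it for real, you can model it against a scratch copy of the folders. The sketch below is a rehearsal aid, not a recovery tool: the function name is hypothetical, it assumes lower-case file names, and it should only be pointed at test directories, since the real repair happens in the Recovery Console.

```python
import shutil
from pathlib import Path


def restore_system_hive(config_dir, backup_hive):
    """Set the damaged hive files aside and copy a backup hive into place."""
    config_dir = Path(config_dir)
    backup_hive = Path(backup_hive)
    if not backup_hive.is_file():
        raise FileNotFoundError(f"no backup hive at {backup_hive}")
    # Rename the damaged files instead of deleting them, as the commands do.
    for old_name, new_name in (("system", "system.old"),
                               ("system.alt", "systemalt.old")):
        source = config_dir / old_name
        if source.exists():
            source.rename(config_dir / new_name)
    # Copy the backup into the config directory as the new System hive.
    return Path(shutil.copy2(backup_hive, config_dir / "system"))
```

Checking for the backup before renaming anything matters here: if the backup is missing, the function stops while the original files are still in place.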
In a perfect world, you’ll have created your ERD within the last 48 hours, but in the real world, you’ll probably have to deal with a server running a registry configuration that was backed up sometime between the original Windows 2000 OS installation and the present. Reboot the server and, with any luck, you won’t have too much software to reinstall.
Restoring the system hive without an ERD
If you don’t have a recent copy of the system hive saved to an ERD, you have two options for fixing the damaged system hive on your server: Fast Repair and the Recovery Console.
Using Fast Repair
You can run Fast Repair from the Emergency Repair Process screen described above, but before you do, make absolutely sure you have no other choice. If you run Fast Repair on a Windows 2000 domain controller, Windows 2000 will activate the system hive as it was when you first installed the operating system.
Fast Repair looks in %systemroot%\repair for the requisite files, but these files won’t have been updated to include any additional programs you installed on your server. For a domain controller, this includes things such as Active Directory. When Fast Repair finishes, you must reboot and reinstall all necessary software and drivers. When you finally get the domain controller going again, be sure to back up the system state daily so you can restore the Active Directory from Directory Services Restore Mode.
Using the Recovery Console
You may also be able to save the day from the Recovery Console. When the Recovery Console command prompt appears, type the following commands, pressing [Enter] after you type each command:
cd system32\config
ren system system.old
ren system.alt system
Remember that System.alt is a complete copy of the System hive file and thus should be the most up-to-date version of the system hive you have. By renaming it, you’re activating it as the new master copy of the system hive. Try to reboot. If it works, you’re done. This procedure should work, but it may fail if whatever corrupted or fragmented the system hive file has also damaged System.alt.
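The same swap can be rehearsed on a scratch copy of the config folder. As before, this is only a sketch with a hypothetical function name and lower-case file names, meant for a test directory rather than a live server.

```python
from pathlib import Path


def promote_alt_hive(config_dir):
    """Rename System to System.old, then make System.alt the new System file."""
    config_dir = Path(config_dir)
    system = config_dir / "system"
    alt = config_dir / "system.alt"
    if not alt.is_file():
        raise FileNotFoundError(f"no System.alt in {config_dir}")
    # Keep the damaged hive under a new name in case it is needed later.
    if system.exists():
        system.rename(config_dir / "system.old")
    alt.rename(system)
    return system
```

Note that once the swap is done there is no System.alt left in the folder, which matches what the two ren commands leave behind.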
If your system fails to boot because the system hive is too big, then this approach won’t help at all, because System.alt is a direct copy of System; no amount of renaming will change its size. The best you can do now from the Recovery Console is type the following commands, pressing [Enter] after you type each one:
cd system32\config
ren system system.old
ren system.alt systemalt.old
copy c:\winnt\repair\system c:\winnt\system32\config
This will rename the old System and System.alt files and copy the original system hive file from %systemroot%\repair back into %systemroot%\system32\config. Then you must reboot and reinstall all necessary software and drivers.
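With three Recovery Console procedures now on the table, it may help to see the choice between them written out as a decision. The sketch below encodes my reading of the steps in this article: the function name and step labels are hypothetical, and Fast Repair is left out because it’s the option of last resort.

```python
RESTORE_FROM_REGBACK = r"copy System from %systemroot%\repair\regback"
PROMOTE_ALT = "rename System.alt to System"
RESTORE_ORIGINAL = r"copy original System from %systemroot%\repair"


def recovery_plan(has_recent_erd, hive_too_big):
    """Return the Recovery Console steps to try, in order."""
    if has_recent_erd:
        # A recent ERD means regback holds a recent copy of the hive.
        return [RESTORE_FROM_REGBACK]
    plan = []
    if not hive_too_big:
        # System.alt is the same size as System, so it cannot fix a size problem.
        plan.append(PROMOTE_ALT)
    plan.append(RESTORE_ORIGINAL)
    return plan
```

For a server with no ERD and a hive that is merely corrupt, the plan is to try System.alt first and fall back to the original hive only if that fails.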
Don’t break out in hives
Encountering a problem with your server’s system hive is no reason to panic. The system hive is critically important, but if you’ve taken time to create an Emergency Repair Disk, you can get things working in short order. Even if you don’t have an ERD, you’re not out of luck. With a little extra work, you can get your server up and running again and not have to worry about data loss.