When a Windows 2000 system fails to boot, the failure typically produces the infamous Blue Screen of Death. If the system still boots into Safe Mode when that happens, it's usually safe to assume that either a critical service or a critical driver is failing to start. You can combine the Blue Screen of Death error message with what you know about the way Safe Mode works to determine the cause of the boot failure.
When most people see the Blue Screen of Death, they simply swear at the machine a few times and then reboot, hoping that the problem will go away. However, the Blue Screen of Death actually contains some really valuable information. The only trick is to know how to read it.
Two types of blue screens
The first thing that you need to know about the Blue Screen of Death is that there are actually two different types, one indicating hardware problems and another indicating software woes. The software version of the Blue Screen of Death is the most common. It simply means that Windows 2000 encountered an error that the OS couldn't recover from. This error may well be related to a hardware problem that wasn't severe enough to generate an error initially, but is bad enough to prevent Windows 2000 from loading. When a software blue screen message is generated because of faulty hardware, it's almost always related to bad memory.
But you shouldn't assume that a software-related blue screen message points to only a memory problem. There are plenty of other causes for blue screen messages. For example, blue screen messages are frequently generated by invalid, missing, or corrupt device drivers. Likewise, these messages can also be caused by incompatible software, especially antivirus software that wasn’t specifically designed for Windows 2000.
The second type of blue screen message is the result of an error generated by faulty hardware. There's a good chance that you'll never see the hardware blue screen, because Windows displays it only when a very serious hardware problem exists. For example, if you've got a system with multiple processors, and the processors are mismatched, you'll see the hardware blue screen. Whatever the cause, the information that's found on the blue screen is critical to helping you solve your problem.
Test before you begin
Before you get started with the troubleshooting techniques that I suggest here, you should complete some initial testing. Sure, you can jump right into the more hard-core testing methods if you like, but it’s usually much easier to do a little trial-and-error work first so that you can eliminate some of the more common causes of boot failure.
In this article, I’m assuming that your system is booting into Safe Mode with no problems. If this is the case, the blue screen is probably the result of either a critical driver or a critical service failing to initialize during boot up. As always, the first thing that you should do is to retrace your steps and try to remember if anything has changed recently. If you can’t remember or if you’re working with a system that someone else may have altered, then it’s time to use the process of elimination.
At this point, try to boot the system into VGA Mode. If VGA Mode works, then the problem is most likely an invalid or corrupt video driver. Replace the video driver and you're back in business. If the system fails to boot into VGA Mode, then you can rule out the video driver as a single point of failure. Keep in mind that this doesn't guarantee that your video driver is correct or that your video card is functional. Sometimes, a system can have multiple problems. For example, on some older systems it's possible for multiple hardware devices to be set to a common IRQ, DMA, or base memory address. When this happens, the symptoms can be confusing because you're actually dealing with two separate problems instead of just one.
If the system fails to boot into VGA Mode, the next step is to attempt to boot into Safe Mode With Networking Support. If the system boots successfully, then the network card driver and the networking services (such as the TCP/IP NetBIOS Helper Service) are probably sound. If it fails, even though plain Safe Mode works, then you can rest assured that the problem exists with either your network card driver or with one of the networking services.
Now that you've performed some preliminary diagnostics, it's time to identify the problem. I recommend booting back into Safe Mode With Networking Support and then making a note of what types of hardware are installed in your system. Next, download the latest drivers from the Internet for all of your hardware devices. When the driver downloads are complete, open the Control Panel and double-click the System icon. On the System Properties sheet, select the Hardware tab and then click Device Manager.
The Device Manager contains a list of all hardware installed on your system. Expand each hardware category to reveal the actual hardware devices. Double-click a device to access the device's properties sheet, select the Driver tab, and click Update Driver. At this point, Windows will launch the Upgrade Device Driver Wizard. Follow the wizard's prompts to install the driver that you downloaded.
When you finish updating the device driver, update the driver for every other major hardware component in your system. Don’t worry about things like hard drives, CD-ROM drives, or USB ports, because these devices typically use universal drivers. You should make sure that things like network drivers, sound card drivers, and video drivers get updated.
When you’ve updated the driver for each device, reboot the system to see if the system will boot normally. If the system still fails to boot, then you need to narrow down the problem so that you know exactly what is causing the boot failure.
How to pinpoint the problem
If updating the drivers doesn't solve the problem, boot the system into Safe Mode and go back into Device Manager. Now, select a major hardware device (anything but the network adapter and the video adapter, since you're already relatively sure that those work because of the tests that you did earlier). Double-click the device to access its properties sheet. Click the General tab, select Do Not Use This Device (Disable) from the Device Usage drop-down list, and click OK. Doing so tells the system not to load the driver for this device at startup. Now repeat the process for each device in the system except for the video adapter and the network card.
At first it may seem odd to disable all of your hardware devices, but remember that if the system starts in Safe Mode but won’t start in normal mode, it means that there’s either a critical device driver failure or a critical service failure. By disabling all of the device drivers that aren’t absolutely necessary, you can make some important discoveries about what’s causing the problem.
Once you’ve disabled all the major device drivers, reboot the system normally. If you're lucky, the system will boot. If the system does boot up, then one (or more) of the device drivers you’ve disabled is causing the problem. You can narrow things down by enabling one device driver and rebooting the system. Repeat the process, enabling one additional device driver after every reboot, until you know which driver is causing the failure.
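The one-driver-at-a-time routine described above can be sketched as a short Python simulation. This is only an illustration of the reasoning, not a tool that touches Device Manager: the `boots_with` callback is a hypothetical stand-in for "reboot normally with these drivers enabled and see whether the system comes up," and the device names are made up.

```python
from typing import Callable, Iterable, List, Set


def find_failing_drivers(
    drivers: Iterable[str],
    boots_with: Callable[[Set[str]], bool],
) -> List[str]:
    """Re-enable drivers one per reboot and collect those that break the boot.

    `boots_with` is a hypothetical callback standing in for a manual reboot
    with the given set of drivers enabled.
    """
    enabled: Set[str] = set()
    # Baseline: with every optional driver disabled the system must boot,
    # otherwise the culprit is not among the drivers being tested.
    if not boots_with(enabled):
        raise ValueError("system fails even with all tested drivers disabled")

    culprits: List[str] = []
    for driver in drivers:
        enabled.add(driver)
        if not boots_with(enabled):
            # This driver broke the boot: note it and disable it again
            # so the remaining drivers can still be tested.
            culprits.append(driver)
            enabled.remove(driver)
    return culprits


# Simulated machine (an assumption for the demo): the sound card driver is bad.
def simulated_boot(enabled: Set[str]) -> bool:
    return "sound card" not in enabled


culprits = find_failing_drivers(
    ["modem", "sound card", "SCSI adapter"], simulated_boot
)
print(culprits)  # ['sound card']
```

Disabling a failing driver again before moving on is a small addition to the procedure in the text. It lets the loop keep going when more than one driver is at fault.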
If the system failed to boot with all of the device drivers disabled, then consider whether the problem is being caused by a conflict between the network adapter and the video adapter. Try disabling the video driver and/or the network driver and experiment with various combinations of enabled and disabled drivers until you find the problem.
If you determine that the video driver and the network driver aren’t to blame, then you’ve got a bigger problem on your hands. More than likely, one of the critical services is breaking down. In this case, you’ll have to rely on information found on the Blue Screen of Death to figure out where your problem lies. The real trick to doing so is to understand how to read the Blue Screen of Death’s error messages.
How to read blue screen messages
The first time that I ever looked at a Blue Screen of Death, I was horrified at the idea of having to make sense of all of the gibberish on the screen. Before you resort to reading it, there is one last way of getting around the blue screen: locate another machine with EXACTLY the same hardware configuration as the one that you're troubleshooting. Many blue screen errors are caused by a hardware failure, so you can take the hard drive out of the failing system and place it into the known good system to find out whether the hardware is to blame.
If the known good system boots successfully off the hard drive from the ailing system, then you can be sure that the system you've been working on has some sort of hardware problem. If the known good system fails to boot using the hard drive from the failing system, then the problem is software-related.
Now that I’ve shown you that one last trick, let’s take a look at the blue screen error. There are actually several different types of blue screen errors. In this article, I'll focus on those most related to troubleshooting a boot failure.
Take a close look at the blue screen error in Figure A, which shows a number of different types of information.
Figure A: The Blue Screen of Death can provide clues to help troubleshoot a boot problem.
As I mentioned, there's a lot of information shown on a blue screen error message. However, for the purpose of solving a boot problem, the most important things to look at are the second and third lines of text. The third line of text in Figure A tells you that the file being loaded at the time of the failure was WDMAUD.SYS.
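If you've copied the top of a blue screen into a text file, a few lines of Python can pull out the pieces that matter. The sketch below is illustrative only: the sample text is invented rather than transcribed from Figure A, and the layout it expects (a STOP line, then the error name, then a line ending in the file name) is an assumption about how the screen is arranged.

```python
import re
from typing import Dict, Optional

# Invented sample text; the codes and addresses are placeholders.
SAMPLE_SCREEN = """\
*** STOP: 0x0000001E (0xC0000005,0xFDE38AF9,0x00000001,0x7E8B0EB4)
KMODE_EXCEPTION_NOT_HANDLED
*** Address FDE38AF9 base at FDE38000, DateStamp 3800b8f0 - wdmaud.sys
"""

STOP_RE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)")
# A file name at the end of the line, preceded by a hyphen.
FILE_RE = re.compile(r"-\s*([\w.$~-]+\.(?:sys|dll|exe))\s*$", re.IGNORECASE)


def summarize_blue_screen(text: str) -> Dict[str, Optional[str]]:
    """Extract the stop code, error name, and file from the first three lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    stop = STOP_RE.search(lines[0]) if lines else None
    name = lines[1] if len(lines) > 1 else None
    module = FILE_RE.search(lines[2]) if len(lines) > 2 else None
    return {
        "stop_code": stop.group(1) if stop else None,
        "error_name": name,
        "file": module.group(1).upper() if module else None,
    }


summary = summarize_blue_screen(SAMPLE_SCREEN)
print(summary["file"])  # WDMAUD.SYS
```

The file name is the value to carry into the next step, where you work out what that file actually does.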
If you were troubleshooting a system and got this error, there's a good chance that you wouldn't know what the WDMAUD.SYS file (or whichever file happens to be listed) does. If you don't recognize the file listed, try using the Windows search feature to locate the file. Many times the file's location within the directory structure will provide you with a clue about the file's purpose.
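The same lookup can be scripted if you happen to have Python available on the machine (or on another machine that can read the drive). The following is a minimal sketch using only the standard library. The demo runs against a throwaway folder tree created for illustration, and the `system32` and `drivers` folder names in it are assumptions, not a statement about where any particular file lives.

```python
import os
import tempfile
from typing import List


def find_file(root: str, filename: str) -> List[str]:
    """Return every path under `root` whose file name matches, ignoring case."""
    target = filename.lower()
    matches: List[str] = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower() == target:
                matches.append(os.path.join(folder, name))
    return matches


# Demo against a fake directory tree (assumed layout, for illustration only).
with tempfile.TemporaryDirectory() as root:
    drivers_dir = os.path.join(root, "system32", "drivers")
    os.makedirs(drivers_dir)
    open(os.path.join(drivers_dir, "wdmaud.sys"), "w").close()

    found = find_file(root, "WDMAUD.SYS")
    relative = [os.path.relpath(path, root) for path in found]

print(relative)
```

On a real system you would point `find_file` at the Windows installation folder instead of the temporary directory. A match inside a folder named for drivers, for example, would suggest that the file is a device driver.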
If the file's location offers no clue, try searching the Microsoft Knowledge Base, which I used to identify WDMAUD.SYS as the Windows Driver Model Audio unit. During the search process, I also located a Knowledge Base article that suggested that my blue screen error may have been related to a DirectX problem.
Blue screens aren’t all bad
The Blue Screen of Death may seem like an annoyance, but it can be helpful if you know how to read it using a systematic approach.