Two weeks ago, I asked for your help debugging a nasty Blue Screen of Death in Windows 2000 Professional. I've encountered BSODs before, but this one was particularly hard to solve. Every one to five days, seemingly at random, my dual-CPU system crashed, hard, as if it had smashed at full speed into a big blue wall. The problem wasn't reproducible, but the trigger was always the same: I would click on a perfectly normal URL and—kaboom!—the screen turned blue.
The Stop message was one I'd never seen before (and I thought I'd seen them all): BAD_POOL_CALLER. None of the NT/W2K experts who responded nailed this particularly tough challenge, but two TechRepublic members offered suggestions that provided important clues. And the whole exercise turned into a great refresher course in dealing with BSODs.
Rule #1: Check the KB. The first wave of responses came from a handful of TechRepublic members who all hit the Microsoft Knowledge Base within minutes of my question being posted. A simple search found that stop code in exactly one KB article (Q258069, "'Stop 0xC2 BAD_POOL_CALLER' Error Message on Print Server"), leading each to conclude that he had the problem licked. One especially confident TechRepublic member sent a pointer to that article, with this cocky cover note: "It's a known problem and Microsoft has a fix. So, click on the link below and pay me..."
Not so fast, Jeremy! You forgot Rule #1A: Don't assume the KB is complete. My search started at the KB, and I found the same article early in my troubleshooting adventure. Microsoft Professional Support confirmed that my problem wasn't the same one.
Rule #2: Check your hardware. Flaky memory? Heat buildup? A defective hard disk? An overclocked CPU? BSODs can be caused by any of those hardware flaws, as TechRepublic member support4pc pointed out: "Is it possible that this is a memory pool problem involving the dual processor configuration? I would start with the hardware, checking/changing memory and possibly removing one processor to see if the problem persists." I ran some diagnostics and ruled out memory and CPU problems quickly—the system had been thoroughly burned in and had been working flawlessly for months. A hard drive problem would almost certainly have produced other symptoms. And I dismissed heat as the culprit after running Motherboard Monitor and verifying that CPU and case temperatures were within normal range.
Rule #3: Focus on what's new. TechRepublic member Steve P. asked the right question in this post: "Most likely this is due to driver conflict. What have you loaded in the past few weeks, just before this started?"
Excellent observation, Steve! The average Windows program can't cause a BSOD, because applications run in user mode, where the kernel's memory protection shields the system from unruly apps. That's not the case with kernel-mode device drivers (including the ones that some services and utilities install), which run at a privileged level. On this machine, I had recently upgraded the BIOS, reinstalled Windows 2000 to add ACPI support, and installed new drivers for the ATA-66 IDE controller. Oh, and I had been experimenting with BlackICE Defender and ZoneAlarm, two personal firewall products. Any one of these changes could have been the cause, but the cryptic Stop code didn't give me enough details to identify which one. Which led me to...
Rule #4: When in doubt, ask for help. I turned to Microsoft Platforms Support, where I was lucky enough to hook up with a support pro named Ben Christenbury. I described the problem, he listened, and he agreed to take a look at the User.dmp file that Windows had created the last time my system crashed. It held enough info to track down the problem.
The culprit turned out to be ZoneAlarm's kernel-mode vsdatant.sys driver, a key component of the firewall software. As Ben explained in an e-mail to me, "This bug check means that the calling thread is making a bad pool request. The pool address being freed is already free. This is not allowed, so the machine bug checked." (From years of poring over KB articles, I know that "bug check" is developer-speak for BSOD.)
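To make Ben's explanation concrete, here is a minimal Python sketch of a double free. It is a toy model only, not ZoneAlarm's code and not the Windows pool manager: the names ToyPool, BadPoolCaller, and buggy_driver are invented for illustration, and the addresses are made up. The idea it shows is simply that the pool keeps track of which blocks are live and refuses a request to free a block that has already been freed.

```python
class BadPoolCaller(Exception):
    """Stands in for the Stop 0xC2 bug check in this toy model."""


class ToyPool:
    """A tiny stand-in for a kernel memory pool that tracks live blocks."""

    def __init__(self):
        self._next_address = 0x1000   # made-up starting address
        self._live = {}               # address -> tag of the allocating caller

    def allocate(self, tag):
        """Hand out a new block and remember that it is in use."""
        address = self._next_address
        self._next_address += 0x100
        self._live[address] = tag
        return address

    def free(self, address):
        """Release a block; freeing one that is not live is a bad request."""
        if address not in self._live:
            raise BadPoolCaller(f"address {address:#x} is already free")
        del self._live[address]


def buggy_driver(pool):
    """Hypothetical driver routine that frees the same block twice."""
    block = pool.allocate(tag="Demo")
    pool.free(block)   # first free: legitimate
    pool.free(block)   # second free: the pool rejects it


if __name__ == "__main__":
    try:
        buggy_driver(ToyPool())
    except BadPoolCaller as err:
        print("STOP: BAD_POOL_CALLER:", err)
```

In user-mode Python the mistake surfaces as an exception that the program can catch. In kernel mode there is no safe way to carry on after a driver corrupts the pool, which is why Windows 2000 halts the machine instead.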
Why the unfamiliar error message? "This would yield a bug check under [NT] 4.0," Ben explained, "but it would be a different one. This ‘stop C2 [BAD_POOL_CALLER]’ is an enhanced troubleshooting stop added for Windows 2000." An article documenting this particular bug should appear in the KB shortly, and hopefully Zone Labs will have a fix soon. Don't be surprised if you see this error message turn up in a slew of new KB articles before the year is out.
Kudos (and 1,000 TechPoints each) to the two TechRepublic members who provided valuable assistance this week.
Here's Ed's new Challenge
My file associations have been hijacked again! Some rude application developer decided that his program deserved to take over all files with a particular file extension, even though I prefer to use another program for that particular task. I want to undo the damage without having to reinstall software. More importantly, I want your help in finding a way to keep this from happening again. How do you keep pushy programs from taking over file associations? Do you have a favorite registry hack or a third-party utility to manage file associations? Share it with your fellow TechRepublic members. I'll hand over 1,000 TechPoints for every suggestion I use in my next column.