Server continually restart at 0440 (4:40 am) every morning

jonathon.johnson
Every morning, one of my backup servers shuts down (I'm assuming a BSOD) and restarts. When I log in that morning, it says (after login) "System restarted after stop error: .....". If I report it to Microsoft, it says it's a blue screen due to a driver error. I've checked all the drivers already. They're the same versions as on another server of the same type.

I've checked the event logs every time, and it's consistent. Sometime between 0436 and 0444 every morning, it mysteriously restarts. There are no scheduled tasks at that time, no major network events at that time, no one working on the server at that time, and nothing I can think of is working.

It's an intermittent problem. It doesn't always restart, just some of the time, usually about 3-6 times a week. Any suggestions?

Here's the log of the event from today:

3/9/2009 7:12:44 AM USER32 Warning None 1076 <server>\<admin name> <server> The reason supplied by user <server>\<admin name> for the last unexpected shutdown of this computer is: System Failure: Stop error
Reason Code: 0x805000f
Bug ID:
Bugcheck String: 0x000000c2 (0x00000044, 0x00002000, 0x80000000, 0x00000000)
Comment: 0x000000c2 (0x00000044, 0x00002000, 0x80000000, 0x00000000)"


From clicking on the link in the Event Viewer, it says it's a driver error. But as I said, they're all up to date and the same versions as on two other servers like it, and those two are not doing this.
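
For anyone reading the log above, here is a minimal Python sketch that splits a "Bugcheck String" line into the stop code and its parameters. The function names and the small lookup table are illustrative assumptions, not part of any Windows tooling. The table is deliberately incomplete; 0xC2 is the BAD_POOL_CALLER stop code, which generally points at a driver or kernel component misusing a memory pool.

```python
import re

# Deliberately tiny lookup table (illustrative, not exhaustive).
KNOWN_BUGCHECKS = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0xC2: "BAD_POOL_CALLER",
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}

# Matches "<code> (<param>, <param>, ...)" where every value is hex.
BUGCHECK_RE = re.compile(
    r"(0x[0-9a-fA-F]+)\s*\(\s*(0x[0-9a-fA-F]+(?:\s*,\s*0x[0-9a-fA-F]+)*)\s*\)"
)


def parse_bugcheck(line):
    """Return (stop_code, [parameters]) parsed from an event log line."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("no bugcheck string found in: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(line):
    """Format the stop code with its symbolic name, if the table knows it."""
    code, params = parse_bugcheck(line)
    name = KNOWN_BUGCHECKS.get(code, "UNKNOWN")
    joined = ", ".join("0x%08X" % p for p in params)
    return "0x%08X %s (%s)" % (code, name, joined)


if __name__ == "__main__":
    sample = ("Bugcheck String: 0x000000c2 "
              "(0x00000044, 0x00002000, 0x80000000, 0x00000000)")
    print(describe(sample))
```

The meaning of the four parameters depends on the stop code, so they are only printed here, not interpreted.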
    risques

    Check what is set to run around that time. I have seen issues with:

    1. Windows Update - If I recall, it is set to update around 3 am by default.

    2. Anti-Virus - Occasionally an update interacts with another similar update.

    3. Email - A scheduled send and receive at the same time as a backup.

    4. A backup routine set with incorrect or faulty parameters.

    5. AutoRestart Element (ASSL) set with trace active.

    jonathon.johnson

    1. Windows Update - this is turned off. We get our updates pushed to us via WSUS from higher.

    2. Anti-Virus - It appears there is a scan every day at 0400, but the history doesn't show any actually completing since December 31st...odd. I'm going to try changing that scan to a time when I'm here and see what happens.

    3. Email - Server unable to email anything. Wish I knew how and I'd set it to email my phone...scary thought if it gets stuck though.

    4. Backups - All scheduled backups to this server have completed successfully, though now that you mention it, I notice SQL Server Agent is stopped and showing a logon failure. That may be because I changed all the admin passwords on Friday. But that's an internal issue we can figure out.

    5. AutoRestart - Not sure even what that is honestly, but we never autorestart any servers.


    I've tried running Windows update as well to ensure all the drivers were up to date, but that didn't give me any good results either.

    jonathon.johnson

    Ran the virus scan twice, and it didn't shut down on me.

    cmiller5400

    I had a server that would just power off for no reason. For weeks we looked for an answer. Then for some reason I was in the computer room when the UPS it was attached to did a self test. Guess what? The server powered off hard. Swapped out the UPS and it was fine.

    Look at environmental factors as well. Is the power being interrupted? Is a cleaner turning on a vacuum on the same line? (It shouldn't be, but you never know...)

    jonathon.johnson

    Unfortunately, I can't be here at 4 am to check whether it does a self test, but throughout the day while I'm in my office, I hear and see both of them do self tests and none of the servers so much as quiver then...

    cmiller5400

    To solve a nasty one requires early mornings. Just ask Colin (HAL9000, OH Smeg) about the alarm going off at a building.

    jonathon.johnson

    I work on a military post. There's no one at this building until 6 am every day...no one. And they aren't going to give me the keys, and I'm more than willing to bet it all that my bosses aren't going to be here that early either lol.

    jhudmon

    Is it connected to a UPS? If you have Network-Shutdown on, you may want to check to see if it's sending a shutdown request to that server. I'm going to stick around to see what the final explanation of this problem is. Please post fix if you find one :D.

    jonathon.johnson

    No UPS settings have been configured yet. This site has been in place for some time, but previous admins here haven't had the knowledge or know-how to set up a "datacenter". We've got 6 servers, all of which are on various UPSes. But as I said, they were never configured. As a matter of fact, the UPS console cables were still in sealed bags. I just started working on them, but this issue had been going on for a few months before I did anything with the UPSes.

    jonathon.johnson

    I will post a fix as soon as I have one. So far, there's nothing leading me to any more info than I had before....but I'm sure it will come soon.

    LarryD4

    Is there anything happening at this time?

    Such as
    a backup is starting
    a backup is ending

    Any maintenance at all?

    jonathon.johnson

    Nothing in particular is happening at that time. At 0400 a virus scan starts, but it is well over by 0440, when this starts. All backups are done earlier in the night and all database jobs are even earlier. All SQL jobs, backup jobs, maintenance jobs, and virus scans are done between 0100 and 0415. Then, magically, the server powers off completely (not a shutdown, but off) and restarts.
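
The timeline described above can be checked mechanically. The following Python sketch lists which scheduled jobs overlap the restart window and how large the gap is after the last job ends. The 0436 to 0444 window and the 0100 to 0415 job window come from this thread; the virus scan end time of 0430 and all function names are assumptions made for illustration.

```python
from datetime import time


def to_minutes(t):
    """Minutes since midnight for a datetime.time value."""
    return t.hour * 60 + t.minute


def overlaps(a_start, a_end, b_start, b_end):
    """True when the two same-day time ranges share at least one minute."""
    return (to_minutes(a_start) <= to_minutes(b_end)
            and to_minutes(b_start) <= to_minutes(a_end))


# Restart window reported from the event logs in this thread.
RESTART_WINDOW = (time(4, 36), time(4, 44))

# Job windows; the scan's end time is a guess, not a logged value.
JOBS = {
    "SQL, backup, and maintenance jobs": (time(1, 0), time(4, 15)),
    "virus scan (assumed end 0430)": (time(4, 0), time(4, 30)),
}


def jobs_overlapping(window, jobs):
    """Names of jobs whose time range intersects the given window."""
    return [name for name, (start, end) in jobs.items()
            if overlaps(start, end, window[0], window[1])]


def gap_minutes(job_end, window_start):
    """Minutes between a job finishing and the window opening."""
    return to_minutes(window_start) - to_minutes(job_end)


if __name__ == "__main__":
    print("overlapping jobs:", jobs_overlapping(RESTART_WINDOW, JOBS))
    print("gap after 0415:", gap_minutes(time(4, 15), RESTART_WINDOW[0]), "min")
```

With these inputs no job overlaps the restart window, so anything tied to the scheduled work would have to fail roughly 20 minutes after the jobs report finishing.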

    LarryD4

    We have to find out what specific driver is causing the blue screen.

    You stated
    All SQL jobs, backup jobs, maintenance jobs, and virus scans are done between 0100 and 0415.

    It could be that your backup software is trying to release the backup device and it's not responding. Eventually, around 4:40, it fails with a blue screen because the hardware is failing to respond to the release.
    (Just a guess)

    jonathon.johnson

    They all complete successfully, or so they say. This particular server is only backed up to; it does not get backed up itself. It is the backup web server and database server. The other jobs that run are a defrag, a disk cleanup, and the server jobs that involve updating and calculating some information for the morning, which take about 5 minutes. The 3-hour schedule I have from 1 am to 4 am for maintenance is quite large for what little this server does, but I just want to keep it in line with the other servers' jobs, which take longer.

    jonathon.johnson

    After failing to understand this whole debug thing, I did manage to run it with quite a few nasty messages saying something about "symbols"....but in the end it says, "Probably caused by: ntkrnlmp.exe ( nt!CcCanIWrite+8c9 )"
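
A hedged note on reading that debugger line: when the debugger complains about symbols and then blames a generic kernel image with a large offset from a named function, the function name is often unreliable and a third-party driver may still be at fault. The Python sketch below parses the "Probably caused by" line and flags both conditions. The threshold value and all names are assumptions chosen for illustration, not debugger behavior.

```python
import re

# Common Windows kernel image names.
KERNEL_IMAGES = {"ntoskrnl.exe", "ntkrnlmp.exe", "ntkrnlpa.exe", "ntkrpamp.exe"}

# Arbitrary cutoff for "suspiciously far from the symbol" (an assumption).
LARGE_OFFSET = 0x400

CAUSE_RE = re.compile(
    r"Probably caused by\s*:\s*(\S+)\s*"
    r"\(\s*([^!\s]+)!([^+\s]+)\+([0-9a-fA-F]+)\s*\)"
)


def parse_cause(line):
    """Split the line into image, module, symbol, and numeric offset."""
    match = CAUSE_RE.search(line)
    if match is None:
        raise ValueError("unrecognised line: %r" % line)
    image, module, symbol, offset = match.groups()
    return {
        "image": image,
        "module": module,
        "symbol": symbol,
        "offset": int(offset, 16),
    }


def hints(cause):
    """Return cautionary notes about how far to trust the blamed symbol."""
    notes = []
    if cause["image"].lower() in KERNEL_IMAGES:
        notes.append("kernel image blamed; a driver may still be responsible")
    if cause["offset"] >= LARGE_OFFSET:
        notes.append("large offset from symbol; symbols may not have loaded")
    return notes


if __name__ == "__main__":
    line = "Probably caused by: ntkrnlmp.exe ( nt!CcCanIWrite+8c9 )"
    cause = parse_cause(line)
    print(cause)
    for note in hints(cause):
        print("-", note)
```

If both notes appear, rerunning the analysis after fixing the symbol path is usually worth trying before trusting the named function.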

    LarryD4

    Re: Windows 2003 Stop 0x000000c2 Error


    Hi. I chased a couple of these around a similar HP box. Two BSOD stop
    errors for two different reasons. Had me caught up for days. The
    resolutions were:

    1. Stop error caused by installation of Symantec Corp Edition V9 and
    Symantec PC Anywhere on the same box. Following installation of PCAW I
    saw numerous stop errors following moderate server load. Solution was
    to unload PCAW.

    2. Stop error caused by pagefile error. Solution was to increase
    physical memory from 0.5 GB to 1 GB.

    jagablack

    Jonathon... did you find a solution?
    I also have an IBM server that reboots unexpectedly at 4:40 AM daily...

    Thanks,
    jagablack

    jonathon.johnson

    No, I've not got this fixed yet. Still can't figure out what it is. I just tolerate it and hope it doesn't wipe something...
