
Automating shutdown when your OS doesn't support your UPS


In the article "Protect your computer and your data with a UPS," I stressed the importance of protecting against file system corruption when the power goes out. If you leave computers running when no one is sitting in front of them, a key component of this protection is an uninterruptible power supply (UPS) that your OS supports. With that support, your OS can shut down gracefully during a power outage instead of crashing when the UPS battery runs out, which would be just as bad as having no battery backup at all.

Even if you have a UPS that your OS doesn't support, all is not necessarily lost. A number of workarounds can give you the automated graceful shutdown that normally depends on power management software capable of running on your OS and communicating with your UPS. In fact, power management software for most UPSs requires that a client application be installed on every computer you want protected against crashes due to blackouts, and that each system's power management be configured separately. A home-built alternative can instead centralize power management so that you configure everything in one place.

The most straightforward means of doing this requires only two computers, neither of which needs to be anything special. In fact, if you have a couple of old 450MHz Pentium II systems each with 64MB of RAM and a 4GB hard drive sitting around in a closet gathering dust, and you've considered taking them to a recycler, this may prove the ideal means of reusing them instead:

  1. Power monitor: Set up one computer with a UPS. We'll call it foo for now. Set up another computer without a UPS. We'll call that one bar for now. Bar is the power monitor: have it ping foo, or otherwise check in with it, on a regular schedule. Once every twenty seconds should be all you need. If possible, configure bar to reboot automatically and resume normal functioning when power returns -- many system BIOSes have a setting for that. That's all bar has to do.
  2. Power manager: Foo is the power manager, and it listens for the pings from bar. Choose a reasonable timeout, limited by how long your UPSes are likely to last if the power goes out. For instance, if your shortest-lived UPS will last twenty minutes, you probably don't want the timeout to exceed about four or five minutes. If foo fails to hear from bar for that period of time, it starts the shutdown process.
  3. Shutdown process: The shutdown process on foo consists of the tasks it must carry out in the event of a power outage. According to your configuration, it uses a secure network protocol (such as SSH) to contact the systems whose power it is managing and issues commands to terminate services and suspend long-running processes to the hard drive as necessary. Processes that need special handling may be listed in a custom configuration file, either on foo or on each managed machine, depending on your requirements and preferences. If you have off-site failover configured to maintain services when the local network goes down, you may want the shutdown process to notify the off-site failover systems that they'll have to pick up the slack. Finally, when everything else is done, foo should initiate shutdown on the managed systems (e.g., the power manager may issue an init 0 command over that SSH connection to UNIX-like systems on your network). Somewhere in the midst of this -- earlier rather than later in the process -- you should configure the power manager to email you or otherwise inform you that something is wrong.
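
The timeout logic on the power manager can be sketched in a few lines of Python. This is only a minimal illustration under some assumptions: it uses a small UDP heartbeat as the "otherwise check in" mechanism rather than ICMP ping (receiving pings in your own code needs raw sockets), and the address, port, and names such as HeartbeatWatchdog and run_manager are hypothetical, not part of any existing tool.

```python
import socket
import time

MONITOR_ADDR = "192.0.2.10"  # hypothetical address of bar, the power monitor
LISTEN_PORT = 9999           # hypothetical UDP port for heartbeats
GRACE_SECONDS = 240          # four minutes, well under a twenty-minute UPS runtime


class HeartbeatWatchdog:
    """Remembers when the last heartbeat arrived and reports when the grace period has lapsed."""

    def __init__(self, grace_seconds, clock=time.monotonic):
        self.grace_seconds = grace_seconds
        self.clock = clock
        self.last_seen = clock()

    def beat(self):
        # Record a heartbeat from the monitor.
        self.last_seen = self.clock()

    def expired(self):
        # True once no heartbeat has arrived for the whole grace period.
        return self.clock() - self.last_seen >= self.grace_seconds


def run_manager(on_outage):
    """Listen for heartbeats from the monitor; call on_outage() once they stop."""
    watchdog = HeartbeatWatchdog(GRACE_SECONDS)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", LISTEN_PORT))
        sock.settimeout(5.0)  # wake up regularly to re-check the watchdog
        while not watchdog.expired():
            try:
                _data, (sender, _port) = sock.recvfrom(64)
            except socket.timeout:
                continue
            if sender == MONITOR_ADDR:  # ignore traffic from any other host
                watchdog.beat()
    on_outage()


if __name__ == "__main__":
    run_manager(lambda: print("No heartbeat from monitor: starting shutdown process"))
```

On the monitor side, bar would only need to send a small datagram to foo's port every twenty seconds. The clock is passed in as a parameter so the timeout behavior can be checked without waiting four real minutes.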

This sort of home-brew system is easier with some OSes than others, of course. The various Linux distributions, FreeBSD, NetBSD, OpenBSD, OpenSolaris, and even proprietary UNIX systems are all well-suited to this solution to the power management problem. BSD Unix and Linux systems are among the best-suited to use on the power monitor and power manager systems, because they can be very secure, stable, and cheap systems that are ideal for giving new life to old, otherwise useless hardware.

Remember to secure your systems, however. Your power manager should be configured so that it doesn't even listen for pings from any computer other than the power monitor, for instance. Even though initiating shutdown on a UNIX-like system requires root access, you should probably disable remote root access even over SSH. Instead, use a staged access policy for your power manager: it first connects using a less-privileged account and then, as a likely best option, employs sudo to initiate shutdown procedures, with sudo limited for the power manager's account to only the specific commands the power manager needs to do its job.
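
As a rough sketch of that staged access policy, the power manager could build its SSH invocations as shown here. Everything specific is an assumption made for illustration: the host names, the powermgr account, and the use of shutdown -h now in place of init 0. It also presumes key-based SSH login for that account and a sudoers rule on each managed host that permits only the one command.

```python
import subprocess

MANAGED_HOSTS = ["web1.example.com", "db1.example.com"]  # hypothetical host names
SSH_USER = "powermgr"                                     # hypothetical unprivileged account

# Assumed sudoers entry on each managed host, limiting the account to this one command:
#   powermgr ALL=(root) NOPASSWD: /sbin/shutdown -h now
SHUTDOWN_COMMAND = ["sudo", "-n", "/sbin/shutdown", "-h", "now"]


def build_ssh_command(host, user=SSH_USER, remote_command=SHUTDOWN_COMMAND):
    """Return the argument list for a non-interactive SSH call as the unprivileged user."""
    return [
        "ssh",
        "-o", "BatchMode=yes",       # never prompt for a password; fail instead
        "-o", "ConnectTimeout=10",   # don't hang on hosts that are already down
        f"{user}@{host}",
        *remote_command,
    ]


def shut_down_all(hosts=MANAGED_HOSTS, runner=subprocess.run):
    """Ask every managed host to shut down; return the hosts where the request failed."""
    failures = []
    for host in hosts:
        try:
            result = runner(build_ssh_command(host), capture_output=True, timeout=60)
        except subprocess.TimeoutExpired:
            failures.append(host)
            continue
        if result.returncode != 0:
            failures.append(host)
    return failures


if __name__ == "__main__":
    for failed_host in shut_down_all():
        print(f"Could not shut down {failed_host}")
```

Because the root login is never used and sudo allows a single command, a compromised power manager account can do little beyond shutting the machine down. The list of failed hosts is a natural thing to include in the alert email.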

The best tools are often the most basic ones, because they carry little overhead and few potential single points of failure. Never neglect the simple solutions just because the lure of expensive commercial systems beckons.

About

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.

4 comments
jpr75

Although this could work, it is full of potential problems. As someone else mentioned, if the switch dies, you shut down your network. If the monitored PC dies, or its network card dies, you shut down your network. A bad LAN cable could also do this. If you have nothing else to work with, this could work, but it really should be a last resort.

The Listed 'G MAN'

Could I make a suggestion? With regard to the Ping, what if it was just the switch or set of switches that was faulty and the Ping failed? Would it not be more robust to have the computers connected via a straight through network cable to eliminate this issue?

apotheon

If you use a second network card on the manager system and a crossover cable, or a serial connection to connect the two, that would probably eliminate more potential issues with the system. Depending on your network architecture, you could also multihome both systems -- on multiple network segments -- so that the failure of one switch or router would not affect the communications of the two systems. There are a number of ways to address the problem of potential communications failures giving a false positive indicator of power loss. Choose the one that best suits your needs and available resources, of course.

gjohnson

Here is how I set up our system in a W2K domain. I'm sure this would be possible on *nix and on non-domain systems as well. One system has UPS monitoring software and a serial-connected UPS. The other systems are on a UPS without monitoring software or a serial connection. The UPS monitoring software is configured to run a batch file 5 minutes after a power failure. The batch file has commands to shut down all the other systems. After 10 minutes, the monitoring system shuts down. Simple, effective, no extra network traffic, no extra system to maintain. In a domain environment, servers have to be started in a certain order, so no auto startup is needed or wanted.
