Automating shutdown when your OS doesn't support your UPS

In the article "Protect your computer and your data with a UPS," I referred to the importance of protecting against file system corruption when the power goes out. If you leave computers running when no one is sitting in front of them, a key component of this protection is an uninterruptible power supply (UPS) that your OS supports. In the event of a power outage, your OS can then be shut down gracefully instead of crashing when the UPS runs out of power, an outcome that is just as bad as having no battery backup at all.

Even if you have a UPS that your OS doesn't support, all is not necessarily lost. There are a number of workarounds that give you the automated graceful shutdown that normally depends on power management software capable of running on your OS and communicating with your UPS. In fact, power management software for most UPSs requires a client application to be installed on every computer you want protected against crashes due to blackouts, and requires each system's power management to be configured separately. By contrast, it is possible to put together a centralized power management system that lets you configure everything in one place.

The most straightforward means of doing this requires only two computers, neither of which needs to be anything special. In fact, if you have a couple of old 450MHz Pentium II systems each with 64MB of RAM and a 4GB hard drive sitting around in a closet gathering dust, and you've considered taking them to a recycler, this may prove the ideal means of reusing them instead:

  1. Power monitor: Set up one computer with a UPS; we'll call it foo for now. Set up another computer without a UPS; we'll call that one bar for now. Bar is the power monitor. Have bar ping foo, or otherwise check in with it, on a regular schedule; once every twenty seconds should be all you need. If possible, configure bar to reboot automatically and resume normal functioning when power returns -- many system BIOSes have a setting for that. That's all bar has to do.
  2. Power manager: Now you have foo listening for pings from bar. Choose a reasonable timeout, limited by how long your UPSes are likely to last if the power goes out. For instance, if your shortest-lived UPS will last twenty minutes, you probably don't want the timeout to exceed about four or five minutes, so that enough battery time remains to finish shutting everything down. If foo fails to hear from bar for that period of time, it starts the shutdown process.
  3. Shutdown process: The shutdown process on foo should consist of the tasks it must carry out in the event of a power outage. It will, according to your configuration, use some secure network protocol (such as SSH) to contact the systems whose power it is managing and issue commands to terminate services and suspend long-running processes to the hard drive as necessary. Processes that normally run on a system and need special handling may be listed in a custom configuration file, either on foo or on each managed machine, depending on your requirements and preferences. If you have off-site failover configured to maintain services when the local network goes down, you may want to set up your shutdown process to notify the off-site failover systems that they'll have to pick up the slack. Finally, when everything else is done, shutdown procedures on the managed systems should be initiated (e.g., the power manager may issue an init 0 command over that SSH connection to UNIX-like systems on your network). Somewhere in the midst of this -- probably earlier rather than later in the process -- you should configure the power manager to email you or otherwise inform you that something is wrong.
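
The timeout decision in step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a complete power manager: the class name HeartbeatWatchdog, its method names, and the injectable clock parameter are all hypothetical, and how check-ins from bar actually arrive (ICMP ping, a small network message, or something else) is left out.

```python
import time


class HeartbeatWatchdog:
    """Tracks check-ins from the power monitor (bar) on the power manager (foo).

    Hypothetical helper for illustration; none of these names come from
    any real power management package.
    """

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        # Treat startup as a check-in so the manager does not fire at once.
        self._last_seen = clock()

    def record_checkin(self):
        """Call this whenever a check-in from bar is received."""
        self._last_seen = self._clock()

    def seconds_silent(self):
        """How long it has been since bar was last heard from."""
        return self._clock() - self._last_seen

    def should_shut_down(self):
        """True once bar has been silent for the whole timeout."""
        return self.seconds_silent() >= self.timeout_seconds
```

With bar checking in every twenty seconds and a four-minute timeout, foo would create HeartbeatWatchdog(240), call record_checkin() on each check-in, and poll should_shut_down() every few seconds. Using a monotonic clock rather than wall-clock time means an adjustment to the system clock cannot trigger or delay a shutdown by accident.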

This sort of home-brew system is easier with some OSes than others, of course. The various Linux distributions, FreeBSD, NetBSD, OpenBSD, OpenSolaris, and even proprietary UNIX systems are all well suited to this solution to the power management problem. BSD Unix and Linux systems are among the best choices for the power monitor and power manager themselves, because they can be very secure, stable, and cheap, which makes them ideal for giving new life to old, otherwise useless hardware.

Remember to secure your systems, however. Your power manager should be configured so that it doesn't even listen for pings from any computer other than the power monitor, for instance. Initiating shutdown on a UNIX-like system requires root access, but you should probably disable remote root access even over SSH. Instead, use a staged access policy: the power manager first connects using a less-privileged account, then, as a likely best option, employs sudo to initiate shutdown procedures. Limit sudo for the power manager's account to only the specific commands the power manager needs to do its job.
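
As a rough sketch of that staged access policy, the power manager could build each SSH invocation from a fixed allowlist instead of passing along arbitrary command strings. Everything named here is an assumption for illustration: the account name powermgr, the ALLOWED_COMMANDS table, and the /sbin/init path, which varies between systems. The sudo configuration on each managed machine would still need to permit exactly these commands for that account without a password.

```python
import shlex

# Hypothetical allowlist; it should mirror what sudo permits for the account.
# The path to init is an assumption and differs between systems.
ALLOWED_COMMANDS = {
    "halt": ["/sbin/init", "0"],
}


def build_remote_command(host, action, user="powermgr"):
    """Return the argv list for running one allowlisted action over SSH.

    The connection uses an unprivileged account (never root), and sudo -n
    fails instead of prompting if the command is not permitted.
    """
    if action not in ALLOWED_COMMANDS:
        raise ValueError(f"action not allowed: {action!r}")
    remote = shlex.join(["sudo", "-n"] + ALLOWED_COMMANDS[action])
    # BatchMode=yes makes ssh fail rather than wait for a password prompt.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote]
```

The returned list can be handed to subprocess.run() as is. Keeping the allowlist on the power manager does not replace the sudo restrictions on the managed machines; it only keeps the manager from asking for anything it should not need.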

It is often the case that the best tools are the most basic ones, with as little overhead and as few potential single points of failure as possible. Never neglect the simple solutions just because the lure of expensive commercial systems beckons.