Most shops do a good job at periodically testing their backup electrical systems such as UPS, batteries, generators, and power distribution units (PDUs), but less so on fire detection and suppression systems. Here are some tips for regularly scheduled inspection and maintenance of these systems.
By Rich Schiesser in conjunction with the Enterprise Computing Institute
If we were to ask typical infrastructure managers to name the major elements of facilities management, they would likely mention common items such as air conditioning, electrical power, and perhaps fire suppression. Some may also mention smoke detection, uninterruptible power supplies, and controlled physical access. Few of them would likely include less common entities such as electrical grounding, vault protection, and static electricity, among others.
Here's a comprehensive list of the major elements of facilities management:
- Air conditioning
- Electrical power
- Static electricity
- Electrical grounding
- Uninterruptible Power Supply (UPS)
- Backup UPS batteries
- Backup generator
- Water detection
- Smoke detection
- Fire suppression
- Facility monitoring with alarms
- Earthquake safeguards
- Safety training
- Supplier management
- Controlled physical access
- Protected vaults
- Physical location
- Classified environment
Temperature and humidity levels should be monitored constantly, either electronically or with recording charts, and reviewed once each shift to detect any unusual trends. Electrical power includes a continuous supply at the proper voltage, current and phasing, as well as conditioning of the power. Conditioning improves the quality of the electricity for greater reliability. It involves filtering out stray magnetic fields that can induce unwanted inductance, doing the same to stray electric fields that can generate unwanted capacitance, and providing surge suppression to prevent voltage spikes. Static electricity, which can affect the operation of sensitive equipment, can build up on non-conductive materials such as carpeting, clothing, draperies and other insulating fibers. Anti-static devices can be installed to minimize this condition. Proper grounding is required to prevent outages, and potential human injury, due to short circuits. Another element sometimes overlooked is whether UPS batteries are kept fully charged.
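The once-per-shift review of temperature and humidity readings could be sketched roughly as follows. This is only an illustration: the `Reading` type, the `out_of_range` helper and the limit values are assumptions made for the example, not figures from any standard, so substitute the ranges your equipment vendors specify.

```python
from dataclasses import dataclass

# Assumed limits for illustration only; use the ranges your vendors specify.
TEMP_RANGE_C = (18.0, 27.0)
HUMIDITY_RANGE_PCT = (20.0, 80.0)


@dataclass
class Reading:
    hour: int             # hour within the shift
    temp_c: float         # temperature in degrees Celsius
    humidity_pct: float   # relative humidity in percent


def out_of_range(readings, temp_range=TEMP_RANGE_C,
                 humidity_range=HUMIDITY_RANGE_PCT):
    """Return the readings whose temperature or humidity is outside the limits."""
    flagged = []
    for r in readings:
        temp_ok = temp_range[0] <= r.temp_c <= temp_range[1]
        humidity_ok = humidity_range[0] <= r.humidity_pct <= humidity_range[1]
        if not (temp_ok and humidity_ok):
            flagged.append(r)
    return flagged


# Hypothetical readings from part of one shift.
shift = [
    Reading(0, 21.5, 45.0),
    Reading(1, 22.0, 46.0),
    Reading(2, 28.3, 44.0),   # too warm
    Reading(3, 23.0, 85.0),   # too humid
]

for r in out_of_range(shift):
    print(f"hour {r.hour}: {r.temp_c} C, {r.humidity_pct}% RH")
```

A check like this only catches readings that are already out of bounds; spotting a slow drift still requires someone to look at the readings over a longer period.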
Water and smoke detection are common environmental guards in today's data centers, as are fire suppression mechanisms. Facility monitoring systems and their alarms should be visible and audible enough to be seen and heard from almost any area in the computer room, even when noisy equipment such as printers is running at its loudest. Equipment should be anchored and secured to withstand moderate earthquakes. Decades ago, large mainframes were safely anchored, in part, by the massive plumbing for water-cooled processors and by the huge bus and tag cables that interconnected the various units. In today's era of fiber optic cables, air-cooled processors and smaller boxes designed for non-raised flooring, this built-in anchoring of equipment is no longer as prevalent.
Emergency preparedness for earthquakes and other natural or man-made disasters should be a basic part of general safety training for all personnel working inside a data center. They should be knowledgeable about emergency power-off procedures, evacuation procedures, first-aid assistance and emergency telephone numbers. Managing data center suppliers in these matters is also recommended.
Most data centers have acceptable methods of controlling physical access into their machine rooms, but not always for vaults or rooms that store sensitive documents, check stock, or tapes. The physical location of a data center can also be problematic. A basement level may be safe and secure from the outside but be exposed to water leaks and evacuation obstacles, particularly in older buildings. Locating a data center along the outside walls of a building can sometimes expose it to sabotage from the outside. Classified environments almost always require data centers to be located as far away from outside walls as possible to safeguard them from outside physical forces such as bombs or projectiles, and from electronic sensing devices.
Major physical exposures common to a data center
Most operations managers do a reasonable job of keeping their data centers up and running. Many shops go for years without experiencing a major outage specifically caused by the physical environment. But the infrequent nature of these types of outages can often lull managers into a false sense of security and lead them to overlook the risks to which they may be exposed. Here are the most common of these:
- Physical wiring diagrams out-of-date
- Logical equipment configuration diagrams and schematics out-of-date
- Infrequent testing of UPS
- Failure to re-charge UPS batteries
- Failure to test generator and fuel levels
- Lack of preventive maintenance on air conditioning equipment
- Annunciator system not tested
- Fire suppression system not recharged
- Emergency power-off system not tested
- Emergency power-off system not documented
- Infrequent testing of backup generator system
- Equipment not properly anchored
- Evacuation procedures not clearly documented
- Circumvention of physical security procedures
- Lack of effective training for appropriate personnel
The older the data center, the greater these exposures become. I have had clients who collectively have experienced at least half of these exposures during the past three years. Many of their data centers were less than ten years old.
Preventive maintenance, testing, inspections or any combination of these should occur at least once a year. I have worked with some shops that have annual maintenance contracts in place for their physical facilities, including onsite inspections, but choose not to exercise them. Untested safeguards, uninspected equipment, undocumented procedures and untrained staff are all preventable invitations to disaster.
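One simple way to keep the once-a-year minimum from slipping is to track the date each safeguard was last tested and list anything overdue. The sketch below is a minimal illustration; the `overdue` helper, the item names and the dates are all hypothetical.

```python
from datetime import date, timedelta

# The once-a-year minimum described in the text.
MAX_INTERVAL = timedelta(days=365)


def overdue(last_checked, today, max_interval=MAX_INTERVAL):
    """Return the sorted names of items never checked, or checked too long ago."""
    late = []
    for name, checked_on in last_checked.items():
        if checked_on is None or today - checked_on > max_interval:
            late.append(name)
    return sorted(late)


# Hypothetical log: item name -> date of last test or inspection (None = never).
log = {
    "UPS load test": date(2023, 11, 2),
    "Fire suppression recharge": date(2022, 6, 15),
    "Emergency power-off test": None,
    "Generator and fuel check": date(2024, 1, 20),
}

for item in overdue(log, today=date(2024, 3, 1)):
    print("Overdue:", item)
```

Treating a missing date as overdue is a deliberate choice here: an item with no recorded test is the kind of exposure the list above warns about.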
Tips to improve the facilities management process
There are a number of simple actions that can be taken to improve the facilities management process. Here are some tips:
- Nurture relationships with facilities department.
- Establish relationships with local government inspecting agencies, especially if considering major physical upgrades to the data center.
- Consider use of video cameras to enhance physical security.
- Analyze environmental monitoring reports to identify trends, patterns and relationships.
- Check on effectiveness of water and fire detection and suppression systems.
- Remove all tripping hazards in a computer center.
- Check on earthquake preparedness of the data center (devices anchored down, training of personnel, tie-in to disaster recovery).
Establishing good relationships with key support departments such as the facilities department and local government inspecting agencies can help keep maintenance and expansion plans on schedule. This can also lead to a greater understanding of what the infrastructure group can do to enable both of these agencies to better serve the IT department.
Video cameras have long been used to enhance and streamline physical security. Occasionally overlooked is the quality of the tape, the recording and the playback mechanism, all of which determine whether playback is possible. These should all be periodically checked. Another item to check is the environmental recording device. Many of these are quite sophisticated and collect a wealth of data about temperature, humidity, purity of air, hazardous vapors and other environmental measurements. The data is only as valuable as the effort expended to analyze it for trends, patterns and relationships. A reasonably thorough analysis should be done on this type of data quarterly.
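As one small example of what analyzing for trends can mean, the sketch below fits a least-squares line to a series of daily average temperatures and reports the drift per day. The `slope_per_day` helper, the sample data and the 0.1 degree alert threshold are assumptions chosen for illustration; a real analysis would cover a full quarter and more than one measurement.

```python
def slope_per_day(values):
    """Least-squares slope of equally spaced readings (one value per day)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical daily average temperatures (degrees Celsius) for one week.
daily_avg_temp_c = [21.0, 21.2, 21.1, 21.5, 21.6, 21.9, 22.0]

drift = slope_per_day(daily_avg_temp_c)

# Assumed alert threshold for illustration: 0.1 degree of warming per day.
if drift > 0.1:
    print(f"Room is warming by about {drift:.3f} C per day; investigate cooling.")
```

Every reading in this sample is individually unremarkable, which is the point: a steady upward drift only shows up when the readings are examined together.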
In my experience, most shops do a good job at periodically testing their backup electrical systems such as UPS, batteries, generators, and power distribution units (PDUs), but less so on fire detection and suppression systems. This is partly due to the huge capital investment electrical backup systems require, and managers wanting to ensure a return on such a sizable outlay of cash. Maintenance contracts for these systems frequently include inspection and testing, at least at the outset. But this is seldom the case with fire detection and suppression systems. Infrastructure personnel need to be proactive in this regard by insisting on regularly scheduled inspection and maintenance of these systems. This also includes up-to-date evacuation plans.
The Enterprise Computing Institute helps IT professionals solve problems and simplify the management of IT through consulting and training based on the best-selling Enterprise Computing Institute book series.