Disaster Recovery

Communication and testing are key elements in backup efforts

Tips for using backup systems as part of a disaster recovery plan

By Steven Vaughan-Nichols

I can be such an idiot some days. There I was, zooming along doing my usual work when I noticed the awful odor of overheated electronics. One quick sniff and a glance around the office revealed that the power supply had given up the ghost in Trinity, my ancient NT 4 Primary Domain Controller (PDC) and main server. Though my office LAN isn't exactly your enterprise network, the situation made me think about just how important network backups are—and how you must make sure that you don't lose your information.

You see, worse than losing my server was how lazy I'd been about my backups. I managed to restore most of my data, but I lost two days doing it. No business can afford that kind of downtime.

With security threats abounding, the danger of your business being out of service is greater than ever. Here are a few tips to ensure that your backup and restore systems are capable of swinging into action at a moment's notice.

Corporate communications
Maybe the most overlooked issue in backup and recovery is simple communication. You should have a procedure for notifying business units about network failure. Everyone in Austin, TX, may know that their server just expired, but the people in Minneapolis may not have a clue. For big failures, you need to make sure that the entire company knows.

The same is true for a restore; everyone who will be affected by it needs to know what was restored successfully and what was lost. Will transactions be lost? Will the local offices need to resynchronize with master information at the data center? You need to keep everyone in the loop. And you need to test these communication procedures to make sure they'll actually work.

Test real recovery scenarios
Okay, so you have your procedure down, but have you tested real-world recovery scenarios? If one site is completely lost, can you restore that site's data at another site? Have you tested it? Yes, it will shoot an entire weekend for the IT staff, but these days, you need to be sure that if a disaster or an attack wipes out entire data centers, your company won't be wiped out at the same time.
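One way to make a restore test more than a gut check is to compare what came back against a checksum manifest taken before the drill. The following is only a minimal sketch, assuming the data in question is a directory tree of ordinary files; the function names build_manifest and compare_restore are made up for illustration and do not come from any backup product.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_restore(source_manifest, restored_root):
    """Return (missing, corrupted) lists of files for a restored tree.

    missing:   files in the manifest that the restore did not bring back
    corrupted: files that came back with different contents
    """
    restored = build_manifest(restored_root)
    missing = sorted(p for p in source_manifest if p not in restored)
    corrupted = sorted(
        p for p in source_manifest
        if p in restored and restored[p] != source_manifest[p]
    )
    return missing, corrupted
```

Build the manifest at the original site before the test, restore at the second site, and run compare_restore there. Two empty lists mean every file in the manifest came back intact. Note that this sketch reads each file fully into memory, so very large files would call for chunked hashing instead.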

Keep hot backups
If you have big-time servers, you need more than just a RAID disk pack; you need local hot backup systems that replicate the data in real time. Again, it's not enough just to have these; you need to test them to make sure they work. On a quiet Saturday night, turn off the switch to your main server and see if the secondary server actually kicks into gear with current data. If it fails, you had better replace it now. Your replicated corporate HQ data may still be safe, but if both your primary and hot backup systems are ever down locally, that won't do you a fat lot of good.
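The Saturday-night test really asks two questions: did the secondary take over, and was its data current? Here is a rough sketch of how you might score such a drill, assuming you can record the time of the last write on the primary and the last write applied on the secondary. The drill_report function and the 30-second lag threshold are assumptions for illustration, not settings from any replication product.

```python
from datetime import datetime, timedelta


def drill_report(primary_last_write, secondary_last_applied, secondary_answered,
                 max_lag=timedelta(seconds=30)):
    """Summarize a failover drill: did the secondary take over with current data?"""
    lag = primary_last_write - secondary_last_applied
    problems = []
    if not secondary_answered:
        problems.append("secondary did not take over")
    if lag > max_lag:
        problems.append(f"secondary data is {lag.total_seconds():.0f}s behind")
    return {
        "passed": not problems,
        "lag_seconds": lag.total_seconds(),
        "problems": problems,
    }


if __name__ == "__main__":
    # Hypothetical timestamps recorded during a drill.
    switched_off = datetime(2001, 11, 24, 23, 0, 0)
    report = drill_report(switched_off, switched_off - timedelta(minutes=10), True)
    print(report["passed"], report["problems"])
```

In this example the secondary answered but was ten minutes behind, so the drill fails. How much lag is tolerable depends on the business, so pick the threshold with the people who own the data.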

Keep local backups
Finally, you should consider mirroring the most current and critical transactional data locally. Reliance on the one big mainframe or server farm at the central office may be cheaper, but you should keep copies of enough data locally to be able to maintain some semblance of normal business in the event of a critical failure.

Never assume that making multiple successive backups is somehow inherently safer than making one master backup and replicating it. I've made this mistake more times than I care to think about. Having multiple successive backups will lead to confusion over which data files are truly the most current, which may result in a loss of data if you use the wrong backup. One master copy and many replicas is the way to go.
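If you do keep one master copy and many replicas, it's worth checking now and then that the replicas really are copies of the master. A minimal sketch, assuming the master backup and each replica are single files you can read from one machine; file_digest and stale_replicas are hypothetical helper names.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large backups don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stale_replicas(master_path, replica_paths):
    """Return the replicas whose contents differ from the master copy."""
    master = file_digest(master_path)
    return [path for path in replica_paths if file_digest(path) != master]
```

An empty list means every replica matches the master byte for byte. Anything listed should be re-copied from the master, not used for a restore.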

Test, test, test
I know this is a lot of work. And, in particular, I know that no one will really want to test those procedures. People don't like spending time on backup and restore issues. They want the backup and restore process to magically save them. It doesn't work that way. Unless you want to be an idiot like I was, you need to grit your teeth and test your systems and your procedures. It's the only way to be sure.

This document was published by ZDNet Tech Update on Nov. 29, 2001.



 