Disaster Recovery

10 common backup mistakes

Despite their best intentions, IT pros sometimes fall short when it comes to implementing a reliable backup solution. See whether any of these mistakes sound familiar.

All of us in IT have been taught from Day One that performing regular backups is critical to an organization's well-being. Yet even seasoned pros sometimes make mistakes. Here are a few of the most common ones I've encountered.

1: Not making system state backups often enough

In Windows environments, system state backups have a shelf life. For domain controllers, that shelf life equals the Active Directory tombstone lifetime (60 days by default). After that, the backup is null and void. Even for non-domain controllers, the age of the backup is an issue.

Each computer on a Windows network has a corresponding computer account in Active Directory. Like a user account, the computer account has an associated password. The difference is that the password is assigned, and periodically changed, by Windows. If you restore a system state backup that is too old, the computer account password stored in the backup will no longer match the password bound to the computer account in Active Directory, so the machine won't be able to participate in the domain. There are workarounds, but it is usually easier to just make frequent system state backups of your servers.
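One low-effort safeguard is a script that flags system state backups approaching the tombstone lifetime. A minimal sketch in Python (the `.bkf` naming convention and the 60-day value are assumptions; confirm the actual tombstone lifetime configured in your forest):

```python
from datetime import datetime, timedelta
from pathlib import Path

# AD default tombstone lifetime; confirm the actual value for your forest
TOMBSTONE_LIFETIME = timedelta(days=60)

def stale_backups(backup_dir, now=None):
    """Return names of system state backup files older than the tombstone lifetime."""
    now = now or datetime.now()
    stale = []
    for f in sorted(Path(backup_dir).glob("*.bkf")):  # hypothetical naming convention
        age = now - datetime.fromtimestamp(f.stat().st_mtime)
        if age > TOMBSTONE_LIFETIME:
            stale.append(f.name)
    return stale
```

Run nightly, a check like this turns a silently expiring backup into an alert you can act on before a restore is ever needed.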

2: Failing to adequately test backups

We all know that we should test our backups once in a while, but testing often seems to be one of those tasks that either falls by the wayside or isn't done thoroughly. Remember that making backups is only the first step; if you can't restore from them, you're dead in the water. You need to ensure that those backups will work if and when you need them.
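One way to make testing routine is to restore files to a scratch location and verify their checksums against the originals. A minimal sketch (the directory layout is illustrative, and checksum comparison supplements rather than replaces full restore drills):

```python
import hashlib
from pathlib import Path

def sha256(path):
    """Hash a file in chunks so large backups don't exhaust memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir against its restored copy.

    Returns a list of relative paths that are missing or differ."""
    failures = []
    src, dst = Path(source_dir), Path(restored_dir)
    for f in src.rglob("*"):
        if f.is_file():
            rel = f.relative_to(src)
            copy = dst / rel
            if not copy.is_file() or sha256(f) != sha256(copy):
                failures.append(str(rel))
    return failures
```

An empty failure list from a scheduled run of this check is far stronger evidence than a "backup completed" log entry.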

3: Not using an application-aware backup application

For some applications, a file-level backup is insufficient. A classic example is Microsoft Exchange, which requires an Exchange-aware backup application. Failure to use one leaves the backed-up data in an inconsistent (and often unrestorable) state. It is therefore important to know which applications reside on your servers and to note any special application-specific backup requirements.
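The same principle shows up at any scale: copying a database's files while the engine is writing can capture a mid-transaction, inconsistent state, which is why database engines expose their own backup interfaces. As an illustrative example (using SQLite rather than Exchange), Python's sqlite3 module provides an online backup API that yields a consistent snapshot:

```python
import sqlite3

def backup_database(src_path, dest_path):
    """Produce a consistent copy of a SQLite database via its online backup API,
    rather than copying the raw file (which may be mid-transaction)."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    with dest:
        src.backup(dest)  # engine-coordinated, transactionally consistent copy
    dest.close()
    src.close()
```

The backup file produced this way can then be swept up safely by an ordinary file-level backup job.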

4: Shipping backup tapes offsite too quickly

One of the companies I used to work for used a courier service to ship backup tapes offsite. Each morning at 8:00, the previous night's backup tapes were transported to an offsite storage facility. One morning, we had a server failure at about 9:30. Unfortunately, we couldn't perform an immediate restoration because the tape had been shipped offsite. It was almost 4:00 before the tape could be located and returned to us. By that time, the server had been down all day. While you should keep backups off site, consider waiting until the end of the business day to remove the previous night's tapes from the building.

5: Having a single point of failure

Always remember that your backups are your safety net. If a server fails, your backups are the primary (and sometimes the only) mechanism for returning the server to a functional state. Because backups are so critically important, you should construct your backup architecture in a way that avoids (at least as much as possible) having a single point of failure. If possible, have a backup for your backups. You never want to find yourself in a situation in which you did not get a backup the night before and you are just praying that the server doesn't fail that day because you have nothing to fall back on.

6: Forgetting to plan for the future

Years ago, I managed the IT department for a large organization. While I was on vacation, some of my staff decided to surprise me by cleaning out a storage room that had become badly cluttered. In doing so, they threw out some obsolete computer equipment.

While this initially seemed harmless, some of the old equipment was in the storage room for a reason. Each quarter, the organization made a special backup that was kept as a permanent archive. Over time, though, backup technology changed. Although the company had decided at one point to switch to a newer tape format, I had kept the organization's old tape drives and an old computer that had a copy of the backup software installed on it, just in case we should ever have to read any of the data from an archive tape. The lesson to be learned is that although change is inevitable, you should always make sure that you have the necessary hardware and software to read your oldest backup tapes.

7: Not considering the consequences of using backup security mechanisms

For most organizations, IT security is a high priority. But sometimes, security can be a bad thing. I have seen real-world situations in which a backup could not be restored because nobody knew the password that had been used on the backup tape. I also once saw a situation in which an organization used hardware-level encryption and then upgraded to a new tape drive that didn't support the previously used encryption (which meant that old backups could not be restored).

There is no denying that it is important to secure your backups, but it is equally important to consider the consequences of your security measures. If you find yourself having to restore a backup after a major system failure, the last thing you need is an ill-conceived security mechanism standing in the way of the recovery.

8: Backing up only data

I once had someone tell me that I should be backing up only my data, as opposed to performing full backups that included the server's operating system and applications. His rationale was that data-only backups complete more quickly and consume fewer tapes. While these are valid points, I completely disagree with the overall philosophy.

If an organization has a server failure and needs to perform a full recovery, it is usually possible to reinstall the operating system and the applications and then restore any data. However, time is of the essence when trying to recover from a crash. It is much faster to restore everything from backup than it is to manually install an operating system and a set of applications. More important, it is often difficult to manually configure a server so that it matches its previous configuration. Backing up the entire server ensures that its configuration will be exactly as it was before the crash.

9: Relying solely on a disk-to-disk backup solution

Disk-to-disk backup solutions offer many advantages over traditional tape backups. Even so, a disk-to-disk backup solution should not be an organization's only backup, because the backup server is prone to the same risks as the servers it protects. A hurricane, lightning strike, fire, or flood could wipe out your backup server along with your other servers. For this reason, it is important to dump the contents of your disk-based backups to tape frequently.

10: Using a tape rotation scheme that's too short

One organization I worked for used a two-week tape rotation. This seemed to work fairly well, but we found out the hard way that two weeks just wasn't enough. The organization had an Exchange server fail because of corruption within the information store. When we tried to restore a backup, we found that we had backed up corrupt data. The corruption had existed for some time and had grown progressively worse. Every one of the backup tapes contained corrupt data, so the server could not be restored. This is a perfect argument for periodically testing your backups, but it also underscores the importance of using a long rotation scheme or at least keeping some of your backup tapes as long-term archives.
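A longer rotation doesn't have to mean many more tapes. A grandfather-father-son scheme keeps dailies briefly, weeklies for a couple of months, and monthlies as long-term archives. A sketch of the classification logic (the tier boundaries and retention periods here are illustrative assumptions, not a standard):

```python
from datetime import date, timedelta

def classify_backup(d):
    """Assign a grandfather-father-son tier and retention to a backup made on date d.

    Monthly (last calendar day of the month) -> kept 1 year,
    weekly (Friday) -> kept 8 weeks, daily -> kept 2 weeks.
    """
    next_day = d + timedelta(days=1)
    if next_day.month != d.month:   # last day of the month
        return ("monthly", timedelta(days=365))
    if d.weekday() == 4:            # Friday
        return ("weekly", timedelta(weeks=8))
    return ("daily", timedelta(weeks=2))
```

With tiers like these, slowly spreading corruption of the kind described above can still be outrun by falling back to a weekly or monthly tape.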

About

Brien Posey is a seven-time Microsoft MVP. He has written thousands of articles and has authored or contributed to dozens of books on a variety of IT subjects.

Comments
husain.al-habib

I will suggest also that the backup admin should consult the application/database administrators to decide what to back up, how to run the backup, and for how long the backup should be protected. Some applications or databases like Exchange, MS-SQL & Oracle DB can be integrated with the backup applications via API, so the backup administrator should coordinate with the Exchange administrators or the DBAs to configure the backups. Some applications are configured with special databases that can't be integrated with backup applications directly, and the developers provide special tools to back up the DB to the HD, so the backup admin should coordinate with the application admin to automate the process of backing up the database to the HD and then copy the resulting file to tape. The application admin should know the business requirement, for example how long they should keep the data in the database or on which days of the month they run the DB purge process to delete old records; based on that he can decide how long to keep the daily backup and the backup made before the purge process. There are many challenges to be addressed by the backup administrator, especially in an enterprise environment.

taylorstan

You missed the most important mistake of all. MONEY!!!! A company I started working for had basically a USB disk for its backup of just the data on a NAS. Well, that solution wasn't working, as I showed them through their logs. I suggested a modest two-week backup system, nothing over the top, but they said find a solution for half the price. That dropped it down to a three-day backup system. Well, how much does it cost to lose information and time???? I would say probably more than 10X the other half of the cost of the system.

jmarkovic32

I had a disk to disk backup solution using Symantec Endpoint Protection 12. I backed up my NAS device on a volume on the NAS device itself and then offloaded that backup to tape. For extra protection, I encrypted the tape. One fateful day, the NAS device crashed with an OS error. Of course, I wasn't too flustered since I had a tape backup of the data and so it was a minor inconvenience. Or so I thought. Turns out that there was a glitch in SEP 12 that if the source backup device was unavailable, you could not restore data from an encrypted tape. I was screwed. Miraculously though, one server on my network kept its mapping to the NAS and although the permissions were screwed up I was able to copy most of my critical data from the NAS including our financial data! The whole three days that this happened I suffered mentally and almost quit my job due to the enormous stress. From then on, I immediately requested that we move to a cloud-based backup solution. It covers all of our bases.

psmithphil

Great article. I'm going to take some of these tips to heart on my own personal backups as well. Especially the one on keeping my backup rotation too short - I need to lengthen it.

ericfraga

Easy and simple: True Image on every server, daily images (incremental) of every partition on every server to a PC with good, large 1.0 or 1.5 TB HDDs. Then, on Friday, copy all .TIBs to an external HDD (1.0 or 1.5 TB HDDs in USB cases) and repeat the process the next Friday. You'll end up with EVERYTHING backed up in TWO different places, and the USB HDD will be durable as rock because it's working only a couple hours a week. No tapes, no discs - only good hard disks and everything doubled.

reisen55

I read this list with interest and realized that my procedures and protocols for my clients negate every single one of these fault points. Inclusive of testing, which I do regularly. I admit to bias here since I learned the hard way during a server crash in September 2001, when my servers crashed all the way from the 103rd floor in the South Tower of the World Trade Center to somewhere below street level. This experience, and it was 1,800 stairs down for me ... I walked 'em ... makes it a natural inclination to have secure procedures that are redundant, tested, and cover every eventuality I can consider. Or, in a trite phrase, BEEN THERE, DONE THAT.

eclypse

We use TSM, so you never ship your primary storage tapes (or disks) off-site, only a copy of them along with your TSM database. If you use another solution, it will hopefully support copy storage pools (or something similar) so that your primary pools are always available for restores.

sethbburgess

I have struggled with various backup technologies over the years in my small business. I have used a variety of tape backup drives and found every one of them cumbersome, unreliable, and extremely time consuming to use. I have settled on two USB portable 1TB hard drives storing weekly disk images for all my machines. The backup time is relatively quick, and the archives can be mounted as a disk drive and explored very quickly compared to tape. I do daily data file backups on the same drives and rotate them off site one at a time in case of fire or other problems. To restore a system I simply restore the latest disk image and copy the latest data. I have done this several times over the years without any unrecoverable failures. (Keeping my fingers crossed; nothing is perfect.) Seth

handyman1972

Unfortunately, I think the non-tech folks tend to look at backup costs much as they might look at car insurance: "What's the cheapest thing I can pay for to get by?" Depending on the situation, I'd say your 10X assessment could be very optimistic. I've thought of creating a "fire drill" scenario to help those in charge of the purse strings better understand the costs of not being prepared.

monicabower

...using an online service to save money. Because unless you have so little data that you could handle it on your own USB drives, it costs a fortune for endpoints, per gig, per month, ad infinitum until that service gets bought by someone or goes out of business and god knows what happens to your data. One of the reasons I'm in the industry is because of incidents like this in the past. A company that only values money in the bank today may find no money there tomorrow. Do these guys skimp on their own health and life insurance as well to save a buck?

eclypse

"From then on, I immediately requested that we move to a cloud-based backup solution. It covers all of our bases." How long does it take to restore your data from "the cloud" and what happens when you can't get to "the cloud"? Who is responsible for "the cloud"?

handyman1972

One thing I found you have to watch out for is specifically what file system your NAS uses. I had a Buffalo NAS that acted as the dumping spot for all of our backups. This was a networked Buffalo with AD integration. Turns out the Buffalo stores everything in XFS (a Linux file system) format. Well, one fateful day I had an OS failure on the Buffalo and could not access anything on the drive. I even pulled the physical drive out of the Buffalo housing and tried to hook it up through a USB drive adapter to our server and an XP workstation. Windows just doesn't know what to do with XFS volumes. There are several workarounds, but some are risky, and they are all time consuming and a pain in the backside. I was ultimately able to get my files back, but it was a 19-hour process to recover 385GB and get it transferred to a new NAS in NTFS format. When purchasing external drives for a Windows environment, you may want to make sure that your external drives retain the NTFS file format, or at least FAT32. If the built-in front end interface takes a dive, your recovery will be much easier.

rwilson

What if on Thursday night your building burns or gets destroyed by a tornado? The only data that you have is the old Friday backup, & you have lost the work from the current week. Is that correct, or is there something else that I need to look at? Currently we do an online backup & a backup to a USB hard drive every night, so the most we should lose would be one day's work...but our USB drive cases seem to fail often & the online seems to be taking longer & longer; being in a rural area we don't have a very fast connection. Any ideas?

ITAuditGuy

My thought exactly. Because at 9:30 one could have a major disaster where the server(s) and the backup tape(s) are all destroyed. So it is best to send a second copy off at the earliest time.

monicabower

Taking a look at a 3X Remote Backup Appliance? If you're pushing around between 100GB and 2TB of live data, anyway. And also - why not restore the old media to a new media format so you don't have to hang on to all the old tapes and tape equipment, which will eventually quit working anyway? We just upgraded all the hardware inside the 3X portable appliances to have 1.2 million hours MTBF - that works out to 137 years of constant use. And no monthly fees so you're not completely hosed if you decide to do something else for backup before your 137 years are up (though chances are pretty good we'll deliver an update before that happens). Third parties are only as reliable as their least trustworthy employee, after all, so there's a market for a high-end appliance you keep and control. No hard sell though. 3x Systems at http://www.3X.com and @3xsystems on twitter if you want to look us up. We're a NetworkWorld product of the week this week too.

MyopicOne

Unless you implied it elsewhere, of course, which is to Monitor. MonitorMonitorMonitor...Make sure the jobs run successfully. As head of corporate DBAs at my old employer, my group checked the most critical database backups (to disk) daily. But the network folks tasked to review the disk-to-disk backup logs simply could not be trusted to look at them and alert others if something important failed - they had "too much other work". Talked to and met with their managers repeatedly - they proved totally unresponsive and worthless. So I had the logs shipped to me in addition and for eight months spent five whole minutes a day reviewing the logs - until I gave up in frustration. Yeah, I'm still annoyed about it... Because six out of eight days a critical database backup (and usually SAP Production) failed to be copied by the disk-to-disk backup system, rendering the entire system vulnerable to a server disk crash. Since that group's management was unresponsive, we created and ran a nightly batch job to copy the @%$%@^ backup files to another server. Wasteful of disk space and other IT resources? Yup. Too damn bad, since 1) my group was doing the responsible group's job by even looking at the logs and 2) that group couldn't be bothered to a) fix the problem or b) be counted on to look at the logs and notify us when something failed, so we could take precautions.

chrisflusche

I'm doing almost EXACTLY the same thing, and it works beautifully! Using Acronis True Image w/ Universal Restore, I can recover any failed server within about 30 minutes on completely different hardware. That is flexibility and power! Tapes can stay in 1989 where they belong!

ssampier

Everything in IT is an expense. I try to steer the conversation toward saving money or providing more services. Invariably that dark mark of dollars and cents creeps up. This is an example in education. Maybe other sectors aren't as cost-averse, but somehow, I doubt it. P.S. Oddly I am reminded of those terrible Iomega Zip drive advertisements: "You have insurance if your teen driver crashes your car. Iomega Zip is crash protection for your PC."

monicabower

I'd be interested in using it too :)

reisen55

I discussed this very issue with a certified BCP/DR consultant I breakfast with once a month, generally to discuss security issues and become horribly depressed. An associate of his lost a laptop, but had his data on good old MOZY. He obtained a replacement and began to transfer data down from good old MOZY and ... 27 hours later ... he had it all back. And that was one system only. A 1.5 TB hard drive at Microcenter is about $114 these days. It's cheaper and faster. The cloud is an internet connection to another server somewhere else. NOTHING IS STORED IN THE CLOUD PER SE. The cloud is merely an immensely long cable connection through the web from your computer to a server somewhere else. Owned by somebody else too. Accessed by somebody else too. And can be hacked. I have no faith, ZERO, in cloud-based backup. You want secure? USB hard drive backup periodically, remove and set the drive on a shelf in your office. I would like to see the internet get to that device!!!!!!

eclypse

Seems like the obvious thing to do here is to install Linux on any random hardware you have laying around and put the drive in there, configure Samba so you can connect from a Windows box - no risk, not time consuming, problem solved. Just appears from reading the post that you knew that you had a Linux filesystem and did everything except install an OS that could read the filesystem...

reisen55

YOU have just outlined the major fault of backup protocols, and that is OFF-LINE or OFF-CRISIS testing of your backups. Plus let us consider restoration of SERVERS and not just data!!! A consultant I work with actually had the pleasure of doing that for the Girl Scouts of America at the IBM STERLING FOREST, NY facility last year. Total reconstruct of the data center from server build to data restore. 11 hours of hard work, documented and faults noted. But ... hey, the tapes are in a secure place. And they always work, don't they?

handyman1972

You raise a good point about possibly losing a week's worth of data. I create daily disk images of every partition on 6 servers (12 images, about 330GB total) that get stored to a separate 500GB physical shared drive inside the domain controller. Then at night, these images are copied to an external USB drive. I have 5 of these drives, rotated for each day of the week, and each drive is large enough to store two days' worth of backups. Thus I have two weeks' worth of images to restore from, which helps guard against backups that may contain corrupted data. This scenario also allows me to have the most recent backups accessible on the drive contained within the PDC. The imaging software automatically deletes any image files older than 13 days from the external USB drives. All I have to do is make sure the drives get swapped each morning, the other 4 are kept off site (in a padded and padlocked steel ammo box bolted inside the trunk of my car), and that I check the backup logs each morning, which takes all of 5 minutes. I have used this scenario to migrate data on two servers to larger hard drives with complete success, and the process only took about 30 to 40 minutes. Lastly, I use 2TB external drives to do monthly archive backups that I keep permanently. Each drive can hold 6 monthly backups with medium compression. Worst case scenario I lose one day's worth of data, which isn't optimal, but the powers that be aren't willing to spend the big bucks to get something closer to real-time backup. If I had my preference, I'd be getting better upstream bandwidth and using realtime backup software to transfer to an offsite server/NAS. What brand of USB drives are you using that you would have so many issues with them? I am using WD and SimpleTech branded drives and they have worked flawlessly for over two years.

pete.irvine

Put a wireless link across to another premises owned by someone friendly and stick a NAS box on a UPS there. This can be in a cupboard or big drawer. We use this method for rural people in New Zealand and it works well.

Greeboid

I couldn't agree more. That's one of the reasons why there are tape mirroring solutions out there! I've worked in too many organisations that have made some, if not all of the ten mistakes made here.

pjboyles

None of them could set up an automated log check and notification? That sounds very bad for an enterprise. One script to check all logs that logs its own actions. One log for manual review to see if there are issues with the log checking. Five minutes of work daily validating all of your logs.

Forum Surfer

That may work great in a small business, but what happens when you have dozens and dozens of servers, several of which are virtual with even more servers contained within? Tapes are very much still a part of a modern, up to date disaster recovery plan...definitely not 1989 tech. We use several methods, including disk to disk and disk to tape to ensure full, reliable backups at different levels.

monicabower

that anything you leave up to people to do themselves, they won't do. Manual backup involves a lot of hope and assumptions which are usually dangerous enough when dealing with live data. If you're interested in a portable backup system that's policy based and autonomous rather than manual, you can find it at a good price for small and medium sized business at 3x.com. It will also make a backup of the backup to USB so you can have that secure USB drive on the shelf whenever you want. They offer free 30 day trials and live webex demos every day so there's no reason not to at least give it a look if you're in the market for a solution to the problem. It's simple, you don't need to spend time fiddling with it or thinking about it, and it gives you a secure backup without needing hope and/or assumptions for it to work when you need to get data back ;) And it will backup all your laptops out in the field too, so if your remote people lose their laptops AND their usb backups you're still in business.

handyman1972

Your solution would have worked, of course. The problem is that I run a one-man IT department for a smaller company, and I generally don't have extra machines lying around. I could have set up a dual boot on my PC, but I couldn't have it tied up for so long transferring almost 400GB of data.

handyman1972

Why all the animosity? I never said I didn't regularly test the backups (disc images of both OS and data partitions of the 6 servers, so yes, SERVER/PDC/AD restoration and not just data, as you put it). I did on numerous occasions, and they worked fine, so long as the front end of the NAS was functional. The problem only arose when the integrated OS of the NAS took a vacation. Nowhere in the documentation or interface of the Buffalo did it state that all data was converted to XFS. Perhaps it's my fault for not digging through the bowels of Buffalo's online forums to find this information prior to purchasing, but I would think the file conversion an important enough technical detail that Buffalo should clearly provide that info.

dave

Tried a 1TB Seagate USB drive recently and had a lot of troubles. Can't say about the SimpleTech, but I have one at home that's reliable. I still use tape for offsite, but USB drives are hard to beat for price and reliability, especially if you're on a tight budget.

CListo

What about the new trend of backing up using iSCSI or web based storage? As of now, we are a small company but with HUGE NAS and tons of windows backup files (flame suit on)

MyopicOne

That's my theory - this should have been a several page report that could be glanced over in minutes and filed. But the answer ultimately was no... That group tried to implement MOM but failed on so many levels it became laughable. Server crashed after three days due to the volume of messages, alerts, etc., nobody monitored the MOM server/database/etc, never got around to adding the monitoring on my 500+ SQL Server databases, just a complete fiasco. Had to send them my requirements three times. But in their defense (and as an outsider), they only were given half the time and resources they asked for, and my group wasn't allowed to be on the project. Yes, very poor management.

Forum Surfer

Factor that in with single tape drives and I can see where the frustration would come from. But with LTO3's or 4's, it really isn't that big of a deal even with a small tape library. I've never had issues with tape backups here, other than "administrator induced issues". :)

green yamo

I feel like tape gets a bad reputation based on just a few products. I back up about 70 TB of data with Backup Exec, a Qualstar tape library, and IBM LTO3 drives. I've never had a problem, even when we had a 2 TB RAID fail and had to restore the whole thing. Are people's negative opinions based on stories about AIT4? We had some issues with that format, but LTO3 has been solid (at least with IBM and HP drives - the other LTO manufacturer, Cer***ce, had some problematic drives.)

eclypse

We use DataDomain units which do what your Exagrid units do. They save me so much time and headache over strictly using tape!!!!!! However, I keep my primary pools on the DataDomain and since I have all that extra tape left over, I still use them for a copy storage pool - just in case. =)

travis.duffy

Even stored properly, we have had too many problems with them. We back up to an Exagrid disk array, and it does data deduplication and replication with another Exagrid disk array at our disaster recovery site. A good tape-free backup strategy.

Sepius

I still find storing tapes to be better than portable drives. The recoveries I have done have all worked without a hitch, as long as the storage is right: not next to the monitor! or in a box on top of the microwave on top of the fridge! I have, however, had issues making backups (an HP drive and Server 2003 has been the worst). I won one client when the competition advised 10 USB drives as a backup solution. I supplied a drive and 40 tapes that came in cheaper, and the procedure was easier for the client, giving me peace of mind. Be careful: apparent ease for us may not be better in a catastrophic event.
