I've written at length about the difficulties associated with preserving digital data for the long term. To round it all off today, I've condensed a number of tips to keep in mind should you find yourself with the unenviable responsibility for storing data.
1. RAID is not fail-safe
I've said it before, and I'll say it again: RAID is not 100% reliable. Yes, by itself, RAID does dramatically increase the survivability of your data. However, anecdotal evidence suggests that simultaneous disk failure is neither impossible nor even particularly rare.
To improve your chances with RAID, build your storage cluster from hard disk drives selected from different manufacturing batches, since drives from the same batch may share defects and wear out at around the same time. Configuring each RAID array with at least one hot-failover drive (often called a hot spare, and available on higher-end RAID controllers) is a minimum in my book.
In addition, it's imperative to assign someone to check the array periodically, preferably as part of their daily routine. It might sound superfluous, but it's entirely possible for a hard drive in a RAID configuration to have failed for some time before anyone notices.
Some RAID drivers come with a front-end that allows you to define alerts or e-mails to be sent in the case of failure. Your mileage might vary though, and it's always good to physically check whenever possible.
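As a rough illustration of an automated check, here is a minimal sketch that flags degraded arrays. It assumes Linux software RAID (md), where array status is exposed as text in /proc/mdstat; hardware RAID controllers use their own vendor tools instead and are not covered here. The function name `degraded_arrays` and the sample text are illustrative assumptions, not part of any RAID product.

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status line shows a missing member.

    Assumes the usual /proc/mdstat layout: a header line such as
    'md0 : active raid1 ...' followed by a line containing '[2/2] [UU]'.
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # '[expected/active] [UU_]' : an underscore marks a missing drive.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded

# Hypothetical sample: md0 is healthy, md1 has lost one of its two drives.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      976630336 blocks [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a real md host you would pass in the contents of /proc/mdstat and wire the result to whatever alerting you already use. A script like this supplements the physical check; it does not replace it.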
And of course, you should have one additional copy of the data at the minimum, be it near-line, off-line, or off-site.
2. You will have to factor in a media reader or interface
One aspect of storage that often gets overlooked is the physical reader required to read the storage medium or, in the case of hard disk drives, the availability of the hardware interfaces needed to connect them.
It's not hard to ensure that tape media are properly stored in a humidity-controlled environment, or that your SCSI drives are properly stowed away under lock and key. But can you be certain that you'll be able to read the data safely archived on your backup media of choice should the need arise? Is your tape drive or Iomega Zip drive still working?
Retaining the long-term ability to read older media is one of the chief reasons tape drive companies develop multi-year roadmaps for their tape technologies.
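One way to answer the question of whether archived data can still be read is to read it back periodically on the hardware you actually have and compare it against checksums recorded at archive time. The sketch below is a minimal illustration of that idea, assuming the archive is mounted as an ordinary directory; the function names and the choice of SHA-256 are my own assumptions, not a prescribed method.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify_manifest(root, manifest):
    """Return relative paths that are missing or whose checksum changed."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

Build the manifest when the data is archived, store it separately from the media, and run the verification during each test restore. An empty result means every file could be read back and matched its recorded checksum.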
3. Archiving data costs money
You should realize from the start that it costs money to store data. Be it near-line hard disk drives that draw electricity, or inert tape cartridges that require humidity controls, there will be a cost. So you must decide what data needs to be archived and how much of it, and allocate an appropriate budget. Management must understand this from the start, too.
4. You will need a plan for defending against internal sabotage
Admittedly, this probably has more to do with the rigorous implementation of data protection practices and policies than with storage technology itself. Depending on the corporate culture and the decision makers in your company, the policies this calls for might be easy to implement, or outright impossible.
The fact is that it's next to impossible to thwart a determined saboteur who really wants to cause the maximum possible damage. But that's no excuse for implementing only the barest policies or safeguards.