Data storage: The word of the day is zettabyte

Current IT strategies can manage today's digital universe, but they won't be up to the challenge of solving the problems that will face the next generation of IT managers.

First, a vocabulary lesson: a zettabyte is a unit equal to one billion terabytes. It seems like an unfathomable size, but IDC predicted in a recent study that by the end of the decade, our digital universe will be 44 times larger than it is now, with approximately 35 zettabytes of virtual data needing management by 2020. While current IT strategies can manage today's digital universe, many of these practices and solutions won't be up to the challenge of solving the problems that will face the next generation of IT managers -- problems that include data backup and disaster recovery. Sooner than we might like, the current rules and solutions will no longer suffice.
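To make the scale concrete, here is a quick back-of-the-envelope calculation written as a small Python sketch; the figures come from the IDC projection above, and the script only does the unit math:

    # 1 zettabyte = 10^21 bytes = one billion terabytes (decimal/SI units)
    TB = 10**12
    ZB = 10**21

    projected_2020_zb = 35    # IDC: ~35 ZB of data by 2020
    growth_factor = 44        # 44 times larger than today

    universe_today_zb = projected_2020_zb / growth_factor
    print(f"1 ZB = {ZB // TB:,} TB")  # 1,000,000,000 TB
    print(f"Digital universe today: ~{universe_today_zb:.2f} ZB "
          f"(~{universe_today_zb * 1000:.0f} exabytes)")

In other words, today's digital universe is a bit under one zettabyte; the projection implies it will multiply more than forty-fold within a decade.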

We're seeing a trend today in which companies look to storage virtualization to provide increased space for growing data loads. It's a step in the right direction, but simply consolidating data into an expandable, virtual pool of storage is not enough. Organizations need to automate all aspects of management related to data protection -- including backup, recovery, replication, and snapshots -- in order to be fully prepared when data restore and disaster recovery solutions must be deployed.

You can take several steps now to best prepare for this expansion of data, to adapt your current backup and disaster recovery (DR) plans to fit future needs, and to guarantee yourself continuous access to mission-critical data. These steps include:

1. Adopt CDP. It's a game-changer.

Backup and DR, equally important for business survival, have historically been treated as two separate islands of IT handled by different departments. Continuous data protection (CDP) technology changes the tried-and-true rules of traditional backup and DR by bridging the gap between these two methodologies -- simplifying overall operations, removing the obstacles of tape media, and helping organizations get the most value from their storage investments. The fundamental benefit of true CDP is the preservation of revenue-generating, or revenue-enabling, business applications. The promise of CDP is the continuous availability of business applications, regardless of any type of failure.
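Conceptually, CDP works like a journal: every write is recorded with a timestamp, so the system can reconstruct the state of a volume at any point in time rather than only at discrete backup windows. The following Python sketch is a toy illustration of that journaling idea only; the class and method names are hypothetical, not any vendor's API:

    import time

    class WriteJournal:
        """Toy CDP journal: record every block write with a timestamp
        so the volume can be rolled back to any point in time."""

        def __init__(self):
            self.entries = []  # (timestamp, block_id, data), append-only

        def record_write(self, block_id, data):
            self.entries.append((time.time(), block_id, data))

        def restore_to(self, point_in_time):
            # Replay writes up to the chosen instant; everything after
            # it (e.g., a corruption event) is simply never applied.
            volume = {}
            for ts, block_id, data in self.entries:
                if ts > point_in_time:
                    break
                volume[block_id] = data
            return volume

The contrast with traditional backup is the granularity of recovery points: instead of "last night's tape," any instant before a failure is a valid restore target.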

2. Tape-based backup is going the way of the dinosaurs.

As data volumes have grown and dependence on data availability and fast recovery has increased, the limitations of tape media have become all too apparent. And as we rocket toward the era of zettabytes of data, tape will become a true dinosaur. We can see the initial causes of tape's inevitable extinction in today's data center: recovery from tape is time-consuming and imprecise. It can exceed 24 hours, and restoring an application to full operation can take even longer. This makes it difficult for companies to meet recovery point objectives (RPOs) and service level agreements (SLAs). The lengthy recovery time with tape is clearly acknowledged by the major backup software vendors, who offer some type of disk-to-disk option integrated into the legacy tape backup paradigm. When tapes are stored off site, as is often the case, it can take several days to retrieve them and restore data. The revenue and productivity lost during that period can be devastating to a business.
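A rough throughput calculation shows why tape restores stretch toward, and past, the 24-hour mark. The numbers below are illustrative assumptions, not figures from the article -- roughly LTO-4-class native drive speed and a hypothetical mid-size restore:

    # Illustrative only: assumed single-drive native throughput and a
    # hypothetical dataset size; real restores add vault retrieval,
    # mounts, seeks, catalog scans, and application recovery on top.
    dataset_tb = 10
    drive_mb_per_s = 120   # assumed ~LTO-4-class native speed

    seconds = (dataset_tb * 1_000_000) / drive_mb_per_s
    print(f"Sequential read alone: ~{seconds / 3600:.0f} hours")  # ~23 hours

And that is just the raw sequential read, before any of the operational overhead that pushes real-world recoveries past a full day.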

3. Vendor lock-in is bad news. Avoid it.

In an effort to speed data recovery, some organizations have opted to replicate to remote DR sites via array-based disk mirrors or snapshot volumes. Array-based snapshots improve recovery points by retaining more frequent data changes between backups. However, because most disk arrays do not communicate across vendor barriers, array-based replication requires a dedicated, homogeneous infrastructure at both the primary and DR sites -- a major drawback. In addition to creating expensive vendor lock-in, array-based replication consumes a great deal of network bandwidth.

Furthermore, failover processes between sites can be complex and error-prone, and failback processes are often insufficiently planned or go entirely untested. For these reasons, replication is usually limited to tier-one applications and storage, while tier-two and tier-three applications and storage remain unprotected and prone to loss or corruption of important data. Lastly, the integrity of array-based snapshot recovery points is usually limited to the last good snapshot, which could be several hours behind. These problems will only be magnified as data storage demands increase.

4. Dump the duplicated data.

In its "The Digital Universe Decade" report, IDC said, "nearly 75 percent of our digital world is a copy -- in other words, only 25 percent is unique." While some of that duplication is necessary for regulatory and legal reasons, most of it is a flat-out waste that can be eliminated with deduplication-capable backup solutions. Data deduplication addresses the pains related to traditional backup methods and optimizes capacity utilization behind backup applications to extend the retention of data online.

5. Combine local and remote protection.

By approaching backup and DR processes holistically, you gain guarantees on your recovery time objectives (RTOs) and recovery point objectives (RPOs). A consolidated platform of data-protection services gives you the flexibility to apply the right protection method to the right data set, application, system, or even data center. Whether it's application-aware snapshots, synchronous replication mirroring, or data journaling, the right mechanism depends on the value of the data and its availability requirements.
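One way to picture such a consolidated platform is as a policy table that maps each tier of data to a protection mechanism and its RTO/RPO targets. The tiers, mechanisms, and numbers in this Python sketch are hypothetical, chosen only to illustrate matching the method to the value of the data:

    from dataclasses import dataclass

    @dataclass
    class ProtectionPolicy:
        method: str        # protection mechanism to apply
        rpo_minutes: int   # max tolerable data loss, in minutes
        rto_minutes: int   # max tolerable downtime, in minutes

    # Hypothetical tier-to-policy mapping: the more valuable the data,
    # the tighter the objectives and the richer the mechanism.
    POLICIES = {
        "tier 1 (revenue-critical)": ProtectionPolicy(
            "synchronous mirror + CDP journal", rpo_minutes=0, rto_minutes=15),
        "tier 2 (line-of-business)": ProtectionPolicy(
            "application-aware snapshots", rpo_minutes=60, rto_minutes=240),
        "tier 3 (archives, file shares)": ProtectionPolicy(
            "deduplicated disk backup", rpo_minutes=1440, rto_minutes=2880),
    }

    for tier, p in POLICIES.items():
        print(f"{tier}: {p.method} "
              f"(RPO {p.rpo_minutes} min, RTO {p.rto_minutes} min)")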

The same technologies that are being embraced today as methods to speed data recovery and reduce downtime -- namely CDP and deduplication -- will be the basis for future strategies to bridge the widening gap between data demands and storage availability. Enterprises that move toward these technologies today will gain tangible advantages and will be far more prepared to capitalize on future opportunities than those enterprises that cling to existing, inadequate solutions that will inevitably become a disadvantage within the next decade.

(For more on zettabytes, see "Goodbye Petabytes and Exabytes, Hello Zettabytes!")

Bobby Crouch is the product marketing manager at FalconStor Software. He is a 20-year technology industry veteran with roles ranging from development engineering to sales and marketing. His expertise spans microprocessor architecture, servers, networking, enterprise software, and storage.

7 comments
jstuart8

When will the SI add units beyond yottabytes?

zackers

You ignore the fact that most of those zettabytes are archived data that is in little day-to-day demand. It makes no sense to store that data on expensive, low-access-time devices. Without a new storage technology, such as really cheap SSDs or memristors, it's hard to see tape going away. Sure, it can take days to get back something stored away on tape, but you should never have data that's needed immediately stored in a remote vault to begin with. There's a cost to keeping all your data on disks somewhere, even if they're not spinning. Drives are mechanical and can fail just sitting on a shelf, just as tape can. But drives are still more expensive per byte than tape, and for archive purposes tape is still the better choice when all costs are considered.

ktaisia

The TAPE will return. "Return of the TaPe"

coolmark82

I am planning on staying with my computer tech stuff for years and years to come, but zettabytes is a word I hope not to come across except when Apple releases its new line of MacBook Pros with 250 zettabytes of HDD space, 10,000 zettabytes of RAM, an optional 80" holographic screen, a touch-sensor light-up keyboard, mind sensors, and much more stuff I might not see for a while. I find it a little strange that we need all of this storage space. Although I welcome it, I barely use my 14TB HDD array at home, let alone 35 zettabytes in the cyber world. I could never put my mind to what the new computers would look like in the future... hopefully nothing from Back to the Future Part II. Maybe a spin-off of the iMac G4 would be nice. Just put an Intel 128-core i99 processor in it running at 77.25GHz, and you'll be fine.

AstroCreep

...and oddly, it's still around. Why? Capacity, speed, control, and ease of use. My company replicates data and makes use of snapshots (VSS and SAN-based) along with regular nightly backups. Some of the nightly backups are to disk (SAN) and others are to tape. Strangely, my nearly five-year-old LTO-2 drive outperforms my backup-to-disk jobs in MB/minute terms. Furthermore, how will someone manage the nightly "removed" media? My backup runs tonight, and what do I take with me tomorrow? Copying upwards of 1TB to a single removable disk takes significantly longer than copying to multiple tapes (in a loader) in a compressed format, and many of us who use Windows know that even though a drive is "removable," Windows may still have it marked as in use. For people who want more control over the way their backups work, I don't see tape going away anytime soon, despite predictions to the contrary over the last eight years or so.

MikeGall

Tapes are very fast. At my last job we had an SL3000 from Sun with 4 LTO-4 tape drives (expandable to 16, I think). Tapes are:

- Cheap: ~30 euros for 1.6TB of capacity (with compression).
- Fast: ~240MB/s with compression, and a fairly consistent speed once writing, versus disks, which are faster on the outside of the platter than on the inside. LTO-5 is a touch faster at 280MB/s and 3TB capacity.
- Low power: we had two admittedly big, but still just two, plugs into our library, with a capacity of 3PB and a combined throughput of 1GB/s (not 1Gb, 1GB), expandable to 4GB/s without the need to add another rack/breaker to the server room. How much power would you need to keep 3PB of disk spinning?
- Scalable: Want to start using a faster generation of LTO? Upgrade your tape drives and presto, you can read your old tapes and start using newer, bigger, faster tapes for your backups. Versus a disk-based system, where you'd likely need to get new disk arrays and add disks in blocks of 20 or so, since you can't RAID disks of different speeds (at least not and enjoy the speed boost). Sure, you can get hot-swappable disks, but in my experience they aren't overly reliable (~10% of the time the LUN became unrecognizable without unmounting and remounting the array), and you're using a technology that is meant to be always connected versus one that was designed to be hot-swappable.
- Reliable: tapes are very simple technology -- no (or at least simple) bearings to wear out, no motors or controllers to fail, built-in RFID to track tape labels, bad blocks, etc.

steve

An LTO tape has a rated archival lifetime of 30 years. Your high-tech disk array will be toast in about 5. The SAN guys I work with figure three years of use for a new disk array: 12 months to transfer all the data from the old one onto it, three years of active use, and 12 months to transfer off to its replacement. Now, you may have trouble finding an LTO-2-capable tape drive in 30 years, or even getting it to talk current protocols at that point, but the data will still be readable. Try that with disk!
