Disaster Recovery

Deduplication technologies are all the rage

Deduplication (dedupe) is all over the place these days, with systems and services emerging on a variety of fronts. Deduping is the process of removing redundant data like operating system files, files that haven't changed, or files stored in multiple users' folders and e-mail archives, so that those files are only backed up once. Several large companies are offering online backups using dedupe technology, and there are dozens of products that can back data up to tape or disk after deduping.

Corporate IT warms up to online backup services (Computerworld)

Deduplication is becoming a necessary technology in the data center, as every company I know of is dealing with an explosion in data storage. Backup management can be extremely complex, and the best backup strategies can come at a very high price. Hitachi recently entered the market with virtual tape library appliances, and NetApp has introduced tools for recovering virtual servers, another technology that is becoming necessary for many organizations.

Hitachi offers plug and play VTL appliance with deduplication (Computerworld)

More Choices in Backup Systems

We have had a deduplicating backup system for a couple of years now and we are in the unenviable position of having to expand as a result of the explosion of data over the last couple of years. I am in the process of crafting a backup strategy that will not break the bank, but still provide the level of service that our dedupe backup system has conditioned our users to expect. My short term headache is that I will probably be forced to start using tape for some backups until our current system can be expanded. Have you explored the deduplication options?

17 comments
brad

We do AS/400 disk-to-disk backup software, and one of our customers turned us on to Data Domain deduplication systems. Basically you have one onsite and another at your DR site, and you sync them. Then everything you back up to the local one gets deduped at 25 to 1 or better, the changes are sent to the remote replication server, and you have a complete offsite backup, automatically and painlessly. SAVLIBLV *ALLUSR once a day, and your backup is done and your offsite copy is safe.

Ike_C

I'm missing something. You say this is new, but I've used Backup Exec for years and had it set so that it would only back up those files that had changed, and it didn't take much time. Then every so often the schedule would run a FULL backup where everything was backed up, and that would take a long time.

eclaymoore

I think you are missing the point a little bit. If you are backing up three servers that all run Windows Server 2003 at the same patch level, a large number of files will be identical across the machines. What the article is talking about is that the backup solution captures only one copy of these files and simply adds an extra reference to the already backed-up file if another machine has it too. To take it a step further, online backup services can deduplicate across all of their clients and possibly publish a list of the files they already hold to each client, so that those files are not even transmitted for backup in the first place. Hope this helps.
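As a rough illustration of the "one copy plus extra references" idea described above, here is a minimal Python sketch. It assumes whole-file deduplication keyed by a SHA-256 content hash; the `DedupeStore` class, the server names, and the file paths are hypothetical and do not come from any product mentioned in this thread.

```python
import hashlib


class DedupeStore:
    """Toy file-level deduplicating backup store (illustrative only)."""

    def __init__(self):
        self.blobs = {}    # content hash -> file bytes, stored once
        self.catalog = {}  # (machine, path) -> content hash (the "reference")

    def backup(self, machine, path, data):
        """Record a file; return True only if its content had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Every file gets a catalog entry, even when the content is shared.
        self.catalog[(machine, path)] = digest
        return is_new

    def restore(self, machine, path):
        """Follow the reference back to the single stored copy."""
        return self.blobs[self.catalog[(machine, path)]]


store = DedupeStore()
os_file = b"identical operating system file"

# Three servers hold the same OS file: it is stored once, referenced 3 times.
for server_name in ("srv1", "srv2", "srv3"):
    store.backup(server_name, "C:/Windows/system32/example.dll", os_file)

# A file that exists on only one server is stored as its own blob.
store.backup("srv1", "D:/data/report.doc", b"unique report")

print(len(store.catalog), "files cataloged,", len(store.blobs), "blobs stored")
```

An incremental backup, by contrast, skips files that have not changed on one machine since its last run; it does not notice that a second machine holds the same file.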

Ike_C

Now that makes better sense. I now wonder if anyone has had to do a true disaster recovery restore of servers using this deduplication method. I have done one with Backup Exec, and it was quite difficult, not as easy as 1-2-3, but it worked.

dawgit

That might be; that's the way the market seems to be working, but... Does this, or will this, comply with existing regulations on data retention? Does this method have a fail-safe process for a 'worst case' scenario, where duplicate, separately stored data is useful? I'm not convinced easy is always better. -d

LocoLobo

Does the deduplication process have to hunt for duplicate data every time it backs up? Does this cost much in time? What about restoring the data? Does the software have any difficulty getting that 1 piece of data out to the 41 different locations where it needs to be? Perhaps in the future we can come up with software that does the "deduplication" from the start, before we do the backup. Maybe this will be done through ECM, but I think of it as "Organization".

Andy Moon

Our solution has an agent that runs on each of the machines I back up and the agent de-dupes the data before we send it to the backup server, saving storage space, network bandwidth, and backup time. The solution also has a restore interface where I navigate through the UI, find the server and file I want to restore, then the system recreates the file out of the de-duped pointers and unique data automatically. I can restore to any location I like (even restoring one user's mail to another mailbox if necessary) and the entire process is lightning fast. I dread having to use tape again.
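The agent-side flow described above (dedupe before sending, then rebuild files from pointers on restore) can be sketched roughly as follows. This is an assumption-laden toy, not the actual product's implementation: it uses fixed-size chunks and SHA-256 hashes, and `BackupServer`, `agent_backup`, `CHUNK_SIZE`, and the machine and file names are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small so the demo data is easy to follow


class BackupServer:
    """Toy backup server that stores each unique chunk once."""

    def __init__(self):
        self.chunks = {}   # chunk hash -> chunk bytes
        self.recipes = {}  # (machine, path) -> ordered list of chunk hashes

    def missing(self, digests):
        """Tell the agent which chunk hashes the server does not hold yet."""
        return {d for d in digests if d not in self.chunks}

    def store(self, machine, path, recipe, new_chunks):
        self.chunks.update(new_chunks)
        self.recipes[(machine, path)] = recipe

    def restore(self, machine, path):
        """Recreate the file from its pointers and the stored unique chunks."""
        return b"".join(self.chunks[d] for d in self.recipes[(machine, path)])


def agent_backup(server, machine, path, data):
    """Agent side: hash locally, send only unknown chunks; return bytes sent."""
    pieces = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    recipe = [hashlib.sha256(p).hexdigest() for p in pieces]
    by_hash = dict(zip(recipe, pieces))
    payload = {d: by_hash[d] for d in server.missing(recipe)}
    server.store(machine, path, recipe, payload)
    return sum(len(chunk) for chunk in payload.values())


server = BackupServer()
sent_first = agent_backup(server, "mail1", "inbox.dat", b"AAAABBBBCCCC")
sent_second = agent_backup(server, "mail2", "inbox.dat", b"AAAABBBBDDDD")
print(sent_first, sent_second)  # prints "12 4"
```

In this sketch the second machine transmits only the one chunk the server has not seen, which is where the storage, bandwidth, and backup-time savings come from.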

LocoLobo

I hate tape too. But for now we are stuck with it. Online backups are out. Some data is backed up to external HD separately.

BButen

What product do you use?

jmacsurf

Symantec Enterprise Vault 7.5 SP2

Andy Moon

...that is built into the Avamar product, but every time we propose it, the project gets nixed by upper management. We keep telling them about the consequences, but so far it has fallen on deaf ears.

eclaymoore

I love these technologies, but how are you incorporating these into a disaster recovery strategy that allows a copy to be stored off site? EMC Replication? Copy to tape later? What?

Andy Moon

We are using a product by a company called Avamar, which was purchased last year by EMC. We have been using it for three years this month.

BBPellet

We use a Data Domain DDR Appliance.

Andy Moon

Our deduplicating backup system has truly been fantastic for us. Our full backup window has gone from ~12 hours to less than one, recovery times are measured in minutes rather than hours, and I haven't had to touch a tape in about three years. How do you think de-dupe would work in your environment?

jmacsurf

Hi Andy, thanks for posting that. I've been trying to get my company to fund an archiving solution so we can take advantage of true single-instance storage technology for file and Exchange data (beyond the lousy built-in SIS) and, of course, reap the great benefits of deduplication. -Jmac

Andy Moon

I love our system and just 5 minutes ago got approval to expand the system, which will double our storage capacity and keep me from having to use tape!
