
Five tips for optimizing your cloud backups

You might be amazed at how long your cloud backups can take. Brien Posey shares pointers based on some performance constraints he ran into.

Those who have never done cloud-based backups before are often surprised by how slow the backup process can be. Here are a few tips that will help you to overcome some of the performance constraints you are likely to encounter when backing up data to the cloud.

1: Back up the most important files first

When you sign up for a cloud backup service, the service will have to make an initial backup before it can begin backing data up incrementally. Depending on the amount of data that needs to be backed up and on the speed of your Internet connection, this initial backup can take a long time to complete. In fact, when I began backing up my own data to the cloud, my initial backup took three and a half months to complete.

Because the first backup can take so long, it is important to prioritize your data. For example, I configured my cloud backup client to back up all of my Microsoft Office documents first, followed by image files. I placed a low priority on executable files and unrecognized file types so that those files would not be backed up until everything else had been backed up.
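Prioritization like this is configured differently in every backup client, but the underlying idea is just an ordering over file types. Here is a minimal, hypothetical sketch (the extensions and tier numbers are illustrative, not any vendor's actual settings):

```python
from pathlib import Path

# Hypothetical priority tiers: lower number = backed up sooner.
PRIORITY_TIERS = {
    ".docx": 0, ".xlsx": 0, ".pptx": 0,   # Office documents first
    ".jpg": 1, ".png": 1, ".tif": 1,      # image files second
    ".exe": 9, ".dll": 9,                 # executables last
}

def backup_order(paths):
    """Return paths sorted so higher-priority file types upload first.
    Unrecognized extensions fall into the lowest tier (9), matching the
    'back up unrecognized types last' approach described above."""
    return sorted(paths, key=lambda p: PRIORITY_TIERS.get(Path(p).suffix.lower(), 9))
```

With this ordering, a queue of mixed files would upload the documents first, then images, and leave executables and unknown types for the end.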

2: Take advantage of bandwidth throttling

Given that an initial cloud backup can take so long to complete, it seems insane to even consider slowing the backup down on purpose. Sometimes, however, that is exactly what needs to be done.

When I first began backing up my data to the cloud, I didn't bother to use the bandwidth throttling feature that was included in the client software. Unfortunately, the backup client consumed so much Internet bandwidth that there was a major impact on most other Web services. Even the simple act of browsing the Internet became painfully slow.

I quickly enabled bandwidth throttling and found a setting that struck a good balance between backup performance and having enough bandwidth for everything else. Although this new setting seemed to work well, I eventually had to cut back on my backup client's bandwidth consumption again.

A few weeks after I began backing up data to the cloud, I received a letter from my ISP. The letter indicated that I was uploading more data per month than my service contract allowed. My ISP threatened to terminate my service if I did not get my bandwidth consumption back down to an acceptable threshold. Unfortunately, this meant further reducing the speed of my cloud backup.
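How a client implements throttling is vendor-specific, but the common approach is to cap average upload throughput: after each chunk is sent, the client pauses long enough that total bytes sent never exceed the configured rate. A minimal sketch, with hypothetical function names:

```python
import time

def throttled_upload(chunks, max_bytes_per_sec, send,
                     clock=time.monotonic, sleep=time.sleep):
    """Send each chunk, then pause as needed so that the average
    throughput never exceeds max_bytes_per_sec."""
    start = clock()
    sent = 0
    for chunk in chunks:
        send(chunk)
        sent += len(chunk)
        # How long this many bytes *should* have taken at the capped rate
        min_elapsed = sent / max_bytes_per_sec
        wait = min_elapsed - (clock() - start)
        if wait > 0:
            sleep(wait)
```

The `clock` and `sleep` parameters are injected only so the logic is easy to test; a real client would simply use the system clock.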

3: Give priority to newer files

Just as you may be able to tell a cloud backup application which types of data you want to back up first, you may also be able to set the software up to give a higher priority to new files.

If you really stop and think about it, a user is far more likely to ask you to restore a file that was created yesterday than to request that you restore a file created five years ago. So it makes sense to configure your backup software to immediately back up any newly created files, even while the initial backup is still being created.

I used this technique in my own environment. My reasoning was that all of the old files on my network had already been backed up to removable media. I still wanted those files backed up to the cloud in case my facility were ever destroyed in a fire or hurricane, but my more immediate need was to back up new files that had never been backed up before. Placing a high priority on newly created files helped me protect that data even while my initial backup was still being created.
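The queuing policy described above can be sketched in a few lines: files modified inside some recent window jump ahead of the initial-backup queue, newest first, while everything else keeps its original order. The one-week window and the function name are assumptions for illustration:

```python
import time

RECENT_WINDOW = 7 * 24 * 3600  # illustrative: last week's files jump the queue

def queue_with_new_first(files, now=None):
    """files: iterable of (path, mtime) pairs.
    Recently modified files go first, newest to oldest; older files
    follow in their original (initial-backup) order."""
    now = time.time() if now is None else now
    recent = [f for f in files if now - f[1] <= RECENT_WINDOW]
    older = [f for f in files if now - f[1] > RECENT_WINDOW]
    recent.sort(key=lambda f: f[1], reverse=True)
    return recent + older
```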

4: Deduplication is essential

Because the process of creating an initial cloud backup can take months to complete, it's a good idea to minimize the amount of data being backed up. This is especially true for anyone who pays for backups by the gigabyte per month.

One way to decrease the amount of data being backed up (without sacrificing protection) is to use deduplication. Different cloud backup providers implement deduplication in different ways. Some providers will back up each file only once. If the same file exists in multiple locations, pointers to the file will be created. These pointers create the illusion that the file has been backed up separately for each location.

Other cloud providers perform block-level deduplication. Rather than skipping duplicate files, the client software creates a checksum for each block that's being backed up and then uses the checksum value as a way of determining whether a duplicate block has already been backed up.
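The block-level approach can be illustrated with a short sketch: split the data into fixed-size blocks, checksum each one, and upload only blocks whose checksum hasn't been seen before. (This is a simplified illustration using SHA-256; real products vary in block size, hashing, and indexing.)

```python
import hashlib

def dedup_blocks(data, block_size=4096, seen=None):
    """Split data into fixed-size blocks and keep only blocks whose
    SHA-256 checksum hasn't been seen before.
    Returns (blocks_to_upload, index); index lists the checksum of every
    block in order, so the file can later be reassembled from storage."""
    seen = set() if seen is None else seen
    to_upload, index = [], []
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        digest = hashlib.sha256(block).hexdigest()
        index.append(digest)
        if digest not in seen:       # duplicate blocks are never re-uploaded
            seen.add(digest)
            to_upload.append(block)
    return to_upload, index
```

Passing the same `seen` set across files is what lets duplicates be skipped even when they live in different files or folders.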

5: Keep backups on site as well

I recommend that you continue to create and store backups on premises. Cloud backups are great for situations in which an entire datacenter is at risk. However, if you ever have to recover a server, it will almost always be faster to restore the data from a local backup than to download the backup data from the Internet. Creating a separate backup that you can store on premises can be a pain, but it is worth the effort, since doing so can help you recover from non-catastrophic disasters more quickly.

About

Brien Posey is a seven-time Microsoft MVP. He has published thousands of articles and has written or contributed to dozens of books on a variety of IT subjects.

3 comments
nikunjverma

The above 5 points could be lifesavers when you have a lot of data or insufficient bandwidth. However, note that even if you have sufficient bandwidth, your cloud backup solution might not use it efficiently. This is because some products enforce artificial speed limits once you cross a usage threshold, and some products simply are not efficient enough. We have just announced Zmanda Cloud Backup 4, which features configurable multithreading to speed up data transfers to the cloud. And of course, it offers options to store data on-premise, enforce bandwidth throttling, perform incremental/differential backups, etc., so that you can also make use of the 5 ideas suggested in this blog post. You can read more here: http://www.zmanda.com/blogs/?p=435 b4real makes a good point too - "delete restore" support is important, and surprisingly many cloud backup solutions simply delete files from the cloud once they are deleted from the user's computers.

b4real

I'd add that the smartest tools will be able to perform block-level change hashes rather than re-transferring an entire file. Another thing to look for is whether the backup mechanism supports delete restore. What I mean by that is that if file A is deleted from the source, the cloud backup solution doesn't correspondingly remove it from the cloud repository. There are cloud backup tools that work that way.

seanferd

Some cloud backup is actually cloud mirror? Do these solutions offer this important bit of info up front? I hope so.