Mark Pimperton describes how his company's move to a disk/online-based backup system has made file restore easier and provided much improved disaster recovery.
It will come as no surprise that, having managed tape backups for more than 10 years, I've long had an ambition to get away from them. We began with DDS-3 tapes, which had an annoying habit of getting stuck in the drives. The larger DLT and LTO tapes didn't get stuck and were certainly faster, but were still prone to the same inherent limitations. Brand new tapes were sometimes unusable; read or write errors could appear after a few weeks, also rendering them useless; even good ones had to be replaced every few months; and a few times a year we'd simply forget to insert them. (We aren't big enough to warrant an autoloader.)
Restoring files did work but was time-consuming -- assuming we could find the right tape and assuming the backup data could be read OK. Bare metal restore of a whole system was possible in theory but never tried in practice as I couldn't risk wrecking my production servers and didn't have spares to try it on.
Having said all that, the Yosemite Server Backup software we used wasn't bad, although Barracuda's support left a lot to be desired. And because we ran full backups to several tape drives overnight we knew we wouldn't have to try to string incremental backups together from several tapes. We did, however, start to struggle with backups taking longer and longer, even to our fastest LTO-4 drive.
My first attempt at moving to disk-based backup in 2010 involved a Buffalo TeraStation NAS. I had visions of keeping day-to-day backups onsite on the NAS and just doing weekly tape backups to take offsite. Unfortunately Yosemite Server Backup wouldn't play nicely with the device, and in any case the backup over the LAN to the Buffalo was so diabolically slow as to be completely impractical.
Later that year I learned about managed backup services combining onsite disk-based backup with offsite online backup. I didn't like the idea of backups being only online as I envisaged simple restores being slower, and whole system restores being extremely slow. But a system using both would eliminate the need for a separate tape backup physically transferred offsite.
After getting a few quotes I had a trial of the Barracuda Backup service. I quickly came to two conclusions:
- Our Internet connection was too slow to cope with uploading even compressed, deduplicated, and incremental backups.
- Barracuda's service in the UK was less than stunning.
It was almost another year before we upgraded our broadband, giving us five times the upload bandwidth. I went back to the market, including a managed service provider (MSP) who'd previously quoted me. They said they were offering a new service from Datto Backup whose price and performance were even better. After extensive commercial and technical discussions, we signed up.
We've been running for a few months now, and the system has met all my objectives:
- No tapes to insert, rotate, replace, or take offsite.
- Quicker file restore.
- Improved recovery points -- we can take several snapshots during the day.
- Disaster recovery of a failed server within 30 minutes by booting the last backup as a virtual machine (VM) locally on the backup server. (Previously this would have been between half a day and two days.)
- Business continuity in the event of a site disaster by booting backups at the online data centre as VMs accessed over VPN. (Previously, recovering all servers would probably have taken us several weeks.)
And yet, it's not been an entirely smooth journey. The first problem was that between us, the MSP (GCI) and I managed to under-specify the size of the backup server. We were in discussions over several months and somewhere along the way we got our calculations wrong. Starting from the amount of data you want to back up (taking into account that it's not possible to exclude folders, only entire volumes), you then have to make an educated guess as to the likely size of the compressed and deduplicated backup data. Crucially, you must then add an allowance for the additional snapshots of each machine. With tape, we could go back in time by picking the right tape (and we had about 11 tapes per server). With a disk-based system, we go back in time by setting a data retention period and storing the changes.
GCI's rule of thumb was that the backup server should have about twice the capacity of the raw backup size. In our case that meant we should have gone for a 3TB model, not the 2TB we chose. To be fair to them, they gave us a good deal on a new contract for the bigger device, and once that was installed we had enough capacity.
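For anyone doing the same sums, the rule of thumb can be sketched in a few lines of Python. This is only an illustration: the function names, the list of available model sizes, and the 1.4TB raw figure are assumptions made up for the example, not numbers from GCI or Datto.

```python
from typing import Optional, Sequence


def required_capacity_tb(raw_backup_tb: float, headroom_factor: float = 2.0) -> float:
    """Apply the 'about twice the raw backup size' rule of thumb.

    headroom_factor covers retained snapshots and growth; 2.0 is the
    rule-of-thumb value, not a guaranteed figure.
    """
    return raw_backup_tb * headroom_factor


def smallest_suitable_model(required_tb: float,
                            model_sizes_tb: Sequence[float]) -> Optional[float]:
    """Return the smallest model that meets the requirement, or None."""
    for size in sorted(model_sizes_tb):
        if size >= required_tb:
            return size
    return None


# Hypothetical inputs: 1.4TB of raw data, models sold in 1, 2, 3 and 4TB sizes.
needed = required_capacity_tb(1.4)
model = smallest_suitable_model(needed, [1, 2, 3, 4])
print(f"Need about {needed:.1f}TB; smallest suitable model: {model}TB")
```

Remember that the raw figure has to count whole volumes, since individual folders can't be excluded.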
To start using the system, the server took an initial backup of all protected machines. Datto then provided a USB hard drive to copy that initial "seed" backup to, and once the transfer was complete the drive was sent to Datto's data centre. Without this step, it would have taken several weeks to upload the data. Unfortunately we had some hiccups with this process as well, with failed data transfers and, in one case, an under-sized USB drive. Once Datto received the data, it seemed to take a long time for it to be processed, but once that was done Datto enabled the offsite sync from our backup server. The sync now has no problem keeping up with the data changes, and for each protected server we can set how often to send data changes offsite.
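To see why seeding by courier beats uploading, it helps to estimate the transfer time. The sketch below is a back-of-envelope calculation only; the function name, the 1,000GB seed size, the 1Mbit/s upload speed, and the efficiency allowance are all hypothetical values chosen for illustration, not our actual figures.

```python
def upload_days(data_gb: float, upload_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to upload data_gb over an upload_mbps link.

    Uses decimal units (1GB = 8,000 megabits). efficiency is an assumed
    fraction of the nominal line speed that is actually usable.
    """
    if upload_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("upload_mbps must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000
    seconds = megabits / (upload_mbps * efficiency)
    return seconds / 86_400  # seconds in a day


# Hypothetical example: a 1,000GB seed over a 1Mbit/s uplink.
print(f"{upload_days(1000, 1.0):.0f} days")
```

Once the seed is in place only the compressed, deduplicated changes have to travel, which is a far smaller daily load.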
The backup software and process aren't perfect and backups do sometimes fail. Datto releases periodic firmware updates for the backup server and updates to the backup agent. The biggest technical hitches we've had relate to booting VMs of the images locally on the backup server. These have mostly been with Windows 2003 servers but occasionally with Windows 2008. Some of these problems have been resolved, but in a few cases we resorted to exporting the backup image as a .VHD file and importing it into Hyper-V on another host server to prove it would boot. (The Datto backup server uses Oracle VirtualBox.)
The managed backup service has taken some getting used to, but it is providing all the business benefits we wanted. We've yet to test the access to remote VMs in the cloud, but other clients have used this successfully. While not perfect, the technology and the service are improving all the time, and we would definitely not want to go back to the days of tape.