
Disk-based backup data: Where to put it?

This is an age-old question, and the answer may not be as simple as you think. Rick Vanover shares his thoughts on the best way to provision storage for availability.

I don’t know about you, but every time I turn around I see a new way of doing storage that I have to sit down and think about. Things such as new protocols, new products, and new disks make our ‘standard’ processes subject to revision at any time. One thing I’ve learned over the last few years about disk-based backups is that there are plenty of things that can go wrong. So, I thought it worthwhile to explain a few points that I’ve learned over the years so you can help provision your storage resources to avoid some of the pitfalls I’ve seen (and experienced myself!).

NFS vs. CIFS

The first storage question I usually address is, “Should I use NFS or CIFS?” CIFS stands for Common Internet File System; it’s an older dialect of SMB (Server Message Block), the file-sharing protocol with deep Windows roots. NFS is the Network File System and has deep UNIX and Linux roots. It’s important to note that CIFS isn’t really CIFS anymore: “CIFS” today is effectively shorthand for SMB, since modern Windows versions heavily favor SMB 2 and 3 and have left the original CIFS dialect behind. We may naturally prefer CIFS/SMB because it is easy. I spend most of my time in Windows, and NFS from Windows is really not pleasant (though it is much better with Windows Server 2012).
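If you do want to reach an NFS export from Windows, Windows Server 2012 ships a Client for NFS feature. A minimal sketch of the setup; the server name `nas01` and export name `backups` are hypothetical, and it assumes the export allows anonymous access:

```shell
# One-time setup (PowerShell, Windows Server 2012):
#   Install-WindowsFeature NFS-Client
# Map the NFS export to a drive letter (cmd). "nas01" and "backups" are
# hypothetical; -o anon avoids needing mapped UNIX credentials.
mount -o anon \\nas01\backups Z:
# Running "mount" with no arguments lists current NFS mounts and options.
mount
```

This is a Windows-only command sequence, shown as a configuration sketch rather than something to script around.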

So, because of this plight, many people choose to put their backups on a CIFS or SMB resource. That’s all fine until one critical situation: the backup that needs to be restored is part of the Active Directory domain. If the storage resource’s CIFS or SMB implementation requires Active Directory for authentication, you may be unable to log into the share at exactly the moment you need it — and that is when you realize the problem. In that situation, NFS may be a better choice: it typically relies on host-based access control and UNIX-style credentials, so Active Directory isn’t required. This is especially helpful should the restore be of Active Directory itself. Trust me: I’ve learned that one the hard way.
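To make the host-based access control concrete, here is a minimal sketch of an NFS export on a Linux-based backup target that has no Active Directory in the authentication path. The export path and client subnet are hypothetical:

```shell
# /etc/exports entry on the backup NAS -- hypothetical path and subnet.
# Access is granted by client address, not by domain credentials:
#   /exports/backups  192.168.10.0/24(rw,sync,no_subtree_check,root_squash)
#
# After editing /etc/exports, re-export all file systems:
exportfs -ra
# Confirm what is exported and with which options:
exportfs -v
```

Because clients are authorized by address, a restore host on that subnet can mount the backups even while the domain controllers are down.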

SAN gotchas

Another tip related to disk-based backups I’d like to share is to consider domains of failure on block-based storage resources such as a SAN. I’ve seen a number of situations where a single SAN controller provides multiple disk tiers or arrays, and backup data is placed on a different tier from the production data. This could be as simple as the SAS drives (higher speed, higher price, lower capacity) holding the running data profile (VMs, servers, LUNs, etc.) while the SATA drives (lower speed, lower cost, higher capacity) hold backups of that data. That’s great until the SAN controller fails, making both drive arrays inaccessible. I know most storage systems are built as dual-controller systems, but if it can go wrong, it just may. The same goes for the storage network in place: if the storage network itself (Ethernet, iSCSI, Fibre Channel, etc.) fails, does that remove access to storage for the recovery scenario?
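One way to sanity-check this is to keep a simple inventory of which controller (or failure domain) each volume lives on, and flag any backup that shares a controller with the data it protects. A minimal sketch; the inventory file, volume names, and controller names are all hypothetical:

```shell
# Hypothetical inventory, one "<volume> <controller>" pair per line.
# Volumes ending in -data are production; -backup are their backup copies.
cat > /tmp/placement.txt <<'EOF'
vm-data controllerA
vm-backup controllerA
sql-data controllerA
sql-backup controllerB
EOF
# Print every controller that is a shared failure domain, i.e. one that
# holds both production data and backup data:
awk '/-data /{d[$2]=1} /-backup /{b[$2]=1} END{for (c in d) if (c in b) print c}' /tmp/placement.txt
```

Here `controllerA` would be flagged: losing it takes out `vm-data` and its backup together, while `sql-backup` survives on `controllerB`.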

There are plenty of things that can go wrong, but what do you do with your storage for disk-based backups to ensure it stays available? Any tips you can share? Leave a comment below!

About

Rick Vanover is a software strategy specialist for Veeam Software, based in Columbus, Ohio. Rick has years of IT experience and focuses on virtualization, Windows-based server administration, and system hardware.

9 comments
G33k0ut

Many servers these days come with internal USB; get a USB-to-SATA converter and attach it internally to a hot-swappable drive bay like the ones made by Icy Dock. Use this like you would a good old-fashioned tape drive. Backups can be written to these drives and stored in a safety deposit box (couriered by someone you can trust). I always use bare-metal backups because I can do a P2V with them, which gives me more options when trying to recover from a disaster (ShadowProtect or Acronis are good options). Keep a second onsite backup for the day-to-day restores that might happen, and for the disasters that hopefully don't. Third, pick an online service in a SAS 70 Type II data center for backups of key individuals' data from the network and personal computers. Give this person the power and control to choose when and what to back up, but be careful not to tear down the years of training (if you don't keep it on the network, I can't back it up).

bart001fr

The first thing you should ensure is that your network is gigabit capable so bottlenecks shouldn't happen. And gigabit switches and Cat 6 cabling aren't that costly.

@swenkruiper, but what if part of the OS of the NAS is that it formats any drive it first encounters, as an automatic feature, when the drive is first inserted into the NAS? This is the case with my HP NAS, and it is part of the OS that came with the machine when I bought it. Granted, I suppose I could have changed the OS and used a Linux OS instead of the Windows Home Server that came with it, but then there goes the warranty. And since I'm new to the idea of using a NAS, I still have some studying to do, as well as some upgrading of the machine to seriously consider, I have come to realize. But wait... is that "format a newly inserted drive" behavior part of the OS, or is it part of the NAS's BIOS? HP doesn't explain any of that in their literature. Whatever the case, if I did as you outlined, I think I would find myself with a set of newly formatted drives with no data on them at all.

So I figure one's best bet is to build his own NAS from standard parts and put it together with good software, learning something very useful in the process. Start with a case (3 - 5 1/4 bays needed) and a micro-ATX mobo, add a 5-disk RAID cage (about $120) and a Linux OS to drive it. There are NAS-specific Linux systems out there; you only have to search for them. Or you can buy Windows Home Server software, but I don't know if you [i]can[/i] customize it like you can a Linux server system. I have also run across systems with more than 5 drives, but they cost upwards of a thousand dollars, so you are getting into enterprise-level storage costs.

If you're in need of this level of backup security, you should seriously consider the enterprise class of SATA hard drives and not use anything bigger than 1TB. A 2-terabyte drive takes 24 hours plus to format! I shudder to think what 3TB would take. And at the quantity of drives needed (3 at a time: 1 for data, 1 for mirror, and 1 for emergency backup (needed on-site and immediately) if one of the primaries fails), they would be cheaper.

dpthakar

I would recommend using Samba on Linux rather than relying on Active Directory for restores. CIFS also stays easy for Windows admins.

Soul--Reaver

The good thing about a NAS is that it has multiple entry points. My NAS supports Samba, AFP, NFS, iSCSI, telnet, FTP, and SSH. If the NAS itself were to fail, a different NAS of the same brand (it might also require the same processor type) can be put into place with the same disk drives, and after setting up the NAS the same data should be available again.

donandmichelledudley

Over the many years I have been in IT, there hasn't been a good way to back up disks and store them off site. I have tried the company vault and found that the guy who was taking them to the bank vault wasn't going straight to the bank, and sometimes the disks were sitting in the front seat of the company car. This isn't good when you have HIPAA- and PCI-compliant data. Then I tried storing them off site at another one of our locations, and when I went to retrieve them they weren't in the vault at the site as they were supposed to be — they were just left on the Director's desk. Again, not good for data that's supposed to be secure. So I don't have a good answer, just more input. I will be looking to see if there is a good offsite storage option, as the online backups are way too expensive.

joe

Agree with joycem. It is a basic mechanical engineering rule: all machines fail. So there is no question about whether your current storage device or any one backup may fail, but rather when. So I use two backups: one local (usually to an external HDD) and a second offsite (an online service). This approach has saved my 'bacon' more than once.

beck.joycem

Whether you have a simple little USB hard drive or a building full of the most sophisticated tech imaginable, there is always the possibility of it going wrong - either in the technical sense of hard drive failure (or whatever) or the human sense of somebody messing it up, accidentally or deliberately. So I always recommend two independent backup strategies, which do not use the same hardware or software, and are not kept in the same place.

beck.joycem

. . . what if you have a major network issue? If your router is essential to both backups, they don't obey my independence rule.

b4real

Don and Michelle - the best answer I've seen is storage replication or scripting. Both take a lot of bandwidth, however.
