Get the details on how VMware's vCloud Hybrid Service - Disaster Recovery service works, including how it handles testing, failover, and failback.
VMware recently came out with a disaster recovery (DR) service called vCloud Hybrid Service - Disaster Recovery (vCHS - DR) that utilizes its vCloud Hybrid Service option and its on-premises solution vSphere. VMware is touting the service as a good DR solution for SMBs or for Tier 2 applications at larger enterprise companies.
How vCHS - DR works
vCHS - DR uses vSphere Replication to copy in-house virtual machines (VMs) to rented storage and compute in one of VMware's vCHS locations. In order to do this, you download the specific DR vSphere Replication virtual appliance and deploy it using your vSphere client. Then, by using the web UI for the vCHS part, you can get the connection information you need to configure the vSphere Replication appliance.
The appliance streams data to vCHS instead of holding it locally, so it does not need much disk of its own and takes up little space in your internal environment. The appliance can handle up to about 500 VMs; if you need more than that, you simply download another appliance. Protection is chosen on a per-VM basis, meaning you pick individually which VMs are replicated.
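As a quick sizing aid, the arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 500-VM figure is the approximate limit quoted here, not a hard number I have verified, and the function name is my own.

```python
import math

# Approximate per-appliance limit quoted in this article (an assumption,
# not a verified hard limit).
VMS_PER_APPLIANCE = 500


def appliances_needed(vm_count: int, per_appliance: int = VMS_PER_APPLIANCE) -> int:
    """Return how many replication appliances a number of protected VMs needs."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    if per_appliance <= 0:
        raise ValueError("per_appliance must be positive")
    # Round up: a partly used appliance is still an appliance to deploy.
    return math.ceil(vm_count / per_appliance)


if __name__ == "__main__":
    print(appliances_needed(1200))
```

Because protection is per VM, only the VMs you actually plan to replicate count toward the total, not every VM in the environment.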
The lowest recovery point objective (RPO) you can get with vCHS - DR is 15 minutes, which should come as no surprise to anyone already using vSphere Replication in-house. If you're concerned about how long the initial sync will take, VMware has thought of that as well. It is possible to do a one-time seeding: VMware sends you some sort of Network Attached Storage (NAS) appliance onto which you copy your data. You then send the NAS appliance back to VMware, where it is hooked up on site in the vCHS location so the data can be loaded locally instead of over your WAN link. Once that data is synced, you can begin using normal vSphere Replication.
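To judge whether seeding is worth the trouble, a back-of-the-envelope estimate of the initial sync over your own uplink helps. The sketch below is plain arithmetic, not anything VMware publishes; the `efficiency` parameter is a hypothetical fudge factor for protocol overhead and competing traffic.

```python
def initial_sync_hours(data_gb: float, uplink_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to push data_gb over an uplink of uplink_mbps.

    efficiency is a hypothetical allowance for overhead and other traffic;
    it is an assumption, not a VMware figure.
    """
    if data_gb < 0:
        raise ValueError("data_gb must be non-negative")
    if uplink_mbps <= 0:
        raise ValueError("uplink_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits in decimal units
    seconds = megabits / (uplink_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: 5 TB of VM data over a 100 Mbps uplink.
    print(round(initial_sync_hours(5000, 100), 1))
```

If the estimate runs to many days, as it does for the hypothetical 5 TB over 100 Mbps above, shipping a NAS appliance starts to look attractive.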
Testing and failover
VMware includes two tests per year in the core pricing package. For these, you file a ticket with the vCHS support team, and they will have people ready for the planned test. There is also a test button that works somewhat like a test in Site Recovery Manager (SRM): it gives you a sandbox network in which to test applications and operating systems. Despite the similarities, SRM and vCHS - DR are not integrated at this time.
In an actual failover, there are two ways to proceed: You can initiate the failover from the vSphere Web Client or from the vCHS - DR web UI. If the failover is unplanned, you should assume that your in-house VMs, including vCenter, may not be available, which is why you can also fail over from the vCHS - DR web UI. Once the failover is initiated, the VMs in vCHS change from placeholders to actual VMs with proper networking to the outside. There's still cleanup to do, which entails changing IP addresses and DNS entries, possibly both internally and externally. This could be scripted to automate the procedure.
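As one possible starting point for that scripting, the sketch below builds a list of address changes by mapping each VM's on-premises IP to the same host offset in a recovery subnet. It only produces a plan; it does not touch DNS or the VMs. The hostnames, subnets, and function names are hypothetical, and I'm assuming the recovery network is at least as large as the source network, which may not match how your vCHS networking is laid out.

```python
import ipaddress
from typing import Dict, List, Tuple


def remap_ip(address: str, source_net: str, target_net: str) -> str:
    """Map an address to the same host offset inside a different subnet."""
    src = ipaddress.ip_network(source_net)
    dst = ipaddress.ip_network(target_net)
    addr = ipaddress.ip_address(address)
    if addr not in src:
        raise ValueError(f"{address} is not in {source_net}")
    offset = int(addr) - int(src.network_address)
    if offset >= dst.num_addresses:
        raise ValueError(f"{target_net} is too small to hold offset {offset}")
    return str(dst.network_address + offset)


def dns_change_plan(
    hosts: Dict[str, str], source_net: str, target_net: str
) -> List[Tuple[str, str, str]]:
    """Return (hostname, old_ip, new_ip) tuples for every protected host."""
    return [
        (name, old_ip, remap_ip(old_ip, source_net, target_net))
        for name, old_ip in sorted(hosts.items())
    ]


if __name__ == "__main__":
    # Hypothetical inventory and subnets, for illustration only.
    inventory = {"app01": "10.1.0.25", "db01": "10.1.0.40"}
    for row in dns_change_plan(inventory, "10.1.0.0/24", "192.168.50.0/24"):
        print(row)
```

Keeping the plan as data means the same list can drive both the guest IP changes and the DNS updates, and it can be reversed later to restore the original addresses.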
In the current version, failback is possible, but it's not necessarily easy. To fail back, you need to download the vCloud Connector (free) appliance from VMware to your vSphere infrastructure. You will use this appliance to bring the VMs back from vCHS to your internal environment. One caveat is that vCloud Connector requires the VM to be turned off to make the copy, so downtime will depend on how large your VM is and how much network bandwidth you have. Again, you will need to change IPs and DNS entries back to what they were originally.
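Since each VM is powered off while it is copied back, it can help to check ahead of time which VMs would overrun a maintenance window. The sketch below assumes the copy runs at the full link rate, which is optimistic, and the VM names, sizes, and function names are made up for illustration.

```python
from typing import Dict, List


def failback_downtime_hours(vm_sizes_gb: Dict[str, float], link_mbps: float) -> Dict[str, float]:
    """Estimate per-VM downtime in hours while each powered-off VM is copied back."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    # GB -> megabits (x 8000), divided by link rate gives seconds, then hours.
    return {
        name: size_gb * 8000 / link_mbps / 3600
        for name, size_gb in vm_sizes_gb.items()
    }


def vms_exceeding_window(downtimes: Dict[str, float], window_hours: float) -> List[str]:
    """Return the names of VMs whose estimated downtime exceeds the window."""
    return sorted(name for name, hours in downtimes.items() if hours > window_hours)


if __name__ == "__main__":
    # Hypothetical VM sizes in GB and a 100 Mbps link.
    estimates = failback_downtime_hours({"web01": 45, "db01": 900}, 100)
    print(estimates)
    print(vms_exceeding_window(estimates, window_hours=4))
```

Any VM that shows up in the second list is a candidate for a longer window or a different way of getting its data home.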
vCHS - DR takes a lot of complications out of a typical DR run book, though the problem remains that there's still no easy way to work with domain controllers. If your VMs rely on Active Directory connectivity, this will need to be addressed.
Since you cannot keep any live VMs in vCHS - DR unless there's a failover event, you may not have a domain controller (DC) at the ready. VMware offers two possible scenarios: You can rent cabinet space in the same physical location as your vCHS - DR and use a technology called Cross Connect to connect your DC (at the physical location) to your vCHS - DR instance; or, you can purchase one of VMware's other vCHS services (which allows you to keep live VMs running in the cloud), and then create a new DC VM to connect to in case of a failover event. In my opinion, neither of these options is stellar, so if you can avoid the DC situation, I would try to do that first. I imagine you could also replicate a DC VM from your in-house environment, but that would require changing IPs on the DC and perhaps other site and DNS information, which could prove to be more trouble than it's worth.
Check out the VMware site for pricing information.
I think vCHS - DR is a slick solution, though I've not had the opportunity to try it out yet. It minimizes cost by cutting out the need for expensive block storage replication (as long as you can accept the 15-minute RPO), and you don't need to be a storage or a DR expert to make it work. Of course, DR always requires planning and should be tested no matter which solution you use.