One of the most difficult aspects of disaster planning is the service level agreement (SLA) that you negotiate with your users. It will probably be the first thing you have to tackle, and it will determine everything else that you include in your plan. An SLA is essentially the promise you make to your end users about how long a system will remain unavailable during an emergency and how much data may be lost. SLAs are built from Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs), and they are often heavily influenced by end-user perspectives and prejudices, which makes them difficult to deal with on a purely technical level.
RPO is a measure of the amount of data that can be lost to a disaster. For example, if you back up to tape once per day, your worst-case loss is one day's worth of data, which happens if the disaster strikes just before the next backup runs. RTO is a measure of how long the systems can be offline during a disaster. An example is the amount of time it takes to bring the standby systems online with a replication and failover solution. Together, these two metrics allow you to create a measurable SLA that can be presented to the end-user community, letting them know when their systems will be back online and what they can expect to see when the process is complete. However, these metrics alone can't help you if you don't know what your end users expect from the DR systems to begin with.
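The two metrics can be checked against a proposed SLA with simple arithmetic. The following is a minimal sketch, not a definitive tool: the names `SlaTarget`, `worst_case_rpo`, `estimated_rto`, and `meets_sla` are hypothetical, and the targets and recovery step durations are made-up illustration values. It assumes that the worst-case data loss equals the interval between backups and that recovery steps run one after another.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SlaTarget:
    """Hypothetical SLA targets negotiated with end users."""
    rpo: timedelta  # maximum tolerable data loss
    rto: timedelta  # maximum tolerable downtime


def worst_case_rpo(backup_interval: timedelta) -> timedelta:
    # Assumption: a disaster just before the next backup loses
    # everything written since the previous one.
    return backup_interval


def estimated_rto(step_durations: list[timedelta]) -> timedelta:
    # Assumption: recovery steps are sequential, so downtime is their sum.
    return sum(step_durations, timedelta())


def meets_sla(target: SlaTarget,
              backup_interval: timedelta,
              step_durations: list[timedelta]) -> dict[str, bool]:
    return {
        "rpo": worst_case_rpo(backup_interval) <= target.rpo,
        "rto": estimated_rto(step_durations) <= target.rto,
    }


# Illustration values only: daily tape backup, and a three-step failover
# (detect the outage, bring standby online, validate the standby).
target = SlaTarget(rpo=timedelta(hours=4), rto=timedelta(hours=2))
steps = [timedelta(minutes=15), timedelta(minutes=30), timedelta(minutes=20)]
result = meets_sla(target, timedelta(days=1), steps)
print(result)  # {'rpo': False, 'rto': True}
```

With these sample numbers, the 65-minute failover fits inside a two-hour RTO, but a daily backup cannot satisfy a four-hour RPO. That gap is the kind of result to bring back to the SLA negotiation.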
End-user requirements are a double-edged sword. On the one hand, they can provide you with definite guidelines as you begin to determine how quickly these systems must be back online. On the other hand, end users tend to be unrealistic in their demands for zero data loss and instant failover. While those goals can be met in a small subset of cases, the vast majority of data systems cannot possibly meet these types of failover "requirements" because of the operating systems they run on, the structures of their data systems, or the very nature of the tools that would be required to perform these operations.
Of course, your budget also comes into play in SLA discussions. The closer you get to zero in RTO and RPO, the higher the cost of the overall solution. Because of the shape of the cost curve, going much below the average allowances in either RTO or RPO means astronomical jumps in funding requirements. Once they see the budget, end users often radically revise their SLA requirements, which opens many more options for DR planning.
SLAs can offer a great way to let your end users know exactly what will happen in an emergency and how quickly they can anticipate getting back online. Involving end users from the start, educating them about budget and technology, and keeping them informed are vital to the creation of a valid SLA. DR planning is for the benefit of these end users, and an SLA can realistically set their expectations and define their roles during the planning process.