A disaster plan is the type of document you hope you never need. But a disaster plan is a key element of systems administration strategy because of the risk of disaster presented by hardware failures, natural catastrophes, network intrusions, and human error. The risk of disaster is particularly high for databases, which in many companies are mission critical. When a disaster strikes your SQL Server databases, you need a plan and a way to recover from that disaster.

Back up your database
This one should be pretty obvious. A good, working backup is essential in order to recover from most disasters. However, a backup will also help protect you against database upgrades that go bad, table corruption, and other system problems.

If you are running a Microsoft SQL Server database that must run 24/7, I highly recommend purchasing and using an active backup agent. If you attempt to directly back up an active database without using an agent, your backup will be corrupt and unusable. All major backup software products including Arcserve and BackupExec have this as an option. Agents allow you to properly back up a database without having to bring down the database server processes.

If you do not have the budget to get your hands on the agent software, an alternative is to bring down the database processes and copy the database files to a different server. Once the copy is complete, you can safely restart the database server processes. This strategy will work for both SQL Server as well as the lightweight MSDE—SQL Server desktop edition.

A third option is to use the SQL Server Enterprise Manager’s backup utility. This allows the backup of an online database and allows the scheduling of the backup in order to automate the process. To back up a database, start the Enterprise Manager and browse to the Management folder under the SQL server, which hosts the database that you wish to back up. Right-click the backup folder and choose Backup A Database from the shortcut menu (see Figure A).

Figure A
This shortcut menu is part of Enterprise Manager.

For this example, I will back up the master database to a disk file that will be subsequently backed up to tape during the normal backup job. In order to do this, I need to create a “backup device”. This is a virtual device that SQL Server will use to place the contents of the backup job (See Figure B).

Figure B
The next step of the database back up includes creating a “backup device.”

To add a device, click the Add button under destination. Second, choose the file name option in the backup destination window that comes up. Third, choose the file selection box. Finally, type in a descriptive name for the backup device. Select OK on all windows until you return to the database backup options window.

You have the option to overwrite each night’s backup or to append it to the existing file. Finally, you have an option to set up a schedule so that your database automatically backs up. Select OK to complete the process (see Figure C).

Figure C
Select OK to complete the database backup progress

In my organization, I run the first and second types of backups. One of our Microsoft SQL Server database servers is loaded with an Arcserve agent, which negates the need for me to intervene to handle the backup. The other database is also SQL Server, but is the MSDE edition (desktop), which is bundled with one of our applications. Since this product does not need to be available 24/7, I run a task each night that stops the database server and copies the raw files to a different disk. That disk is, in turn, backed up by the nightly Arcserve job.

A good backup will likely be one of your needs in the event that something bad happens to your database. If you lose a database or a table becomes corrupt, you can reload the data from this backup. If you lose your entire server, you will have to set up a new one and reinstall your backup software before you can make use of the backup.

The testing of a backup, a task that is often overlooked, is absolutely critical. A backup can easily be tested by restoring a file or database from the backup media onto a different server or into a different database.

You know the sinking feeling that you can get in your stomach when you realize that something is really not right?  Imagine having to rebuild your company’s database server with financials for the past 6 years only to find that the backup has never really worked. The backup software log said it backed up, but none of the media can be read. For the sake of your career and the well being of your company, it’s important to make sure that a database can be both restored and used.

Write it all down
The next important aspect to SQL Server disaster recovery is to have detailed documentation for your environment. I like to approach documentation from the assumption that I will eventually have to completely rebuild the network infrastructure from scratch. As such, I like to document absolutely everything about a server including patches installed, IP addressing, etc. For SQL Server, it is especially important to keep track of patch information since certain stored procedures change with each service pack.

Besides documenting the SQL installation, it’s also important to keep track of OS and hardware items as well. If you ever need to rebuild your database server, it would be nice to be able to prove that the box originally had four processors, for example. In addition to the spreadsheet, keep diagrams of database and table relationships as well as of your network. Rebuilding this information later on is much easier if you have a picture to guide you.

While a system will generally be able to be recovered eventually even without good documentation, for many companies, downtime equates to lost revenue. Being able to save significant time in a rebuilding process by making use of documentation and well-thought-out diagrams will restore that revenue stream much more quickly.

Log shipping for Enterprise Edition users
A disaster is not necessarily an asteroid nailing your data center – although that really wouldn’t be a good thing. A disaster can be defined as any incident that interrupts the operation of your SQL Server. As such, part of your disaster recovery plan should include high availability services, especially for mission critical data.

One such service is the log shipping service. This service periodically sends transaction logs from a “hot” SQL server to a “warm” SQL server. The transaction logs are used to keep the warm server synchronized with the hot one. If the hot server fails, you can switch to the warm server with very little interruption in service. See Microsoft’s Web site for details on this service.

Keep in mind that this feature is only found in the Enterprise Edition of SQL Server. Log shipping is only one small piece of the disaster recovery/high availability puzzle. For a plethora of information on keeping your enterprise running through catastrophes with redundancy, Microsoft has a high availability service planning and deployment guide available for download.

As with good documentation, a high availability service will reduce downtime. The only downtime that should be suffered is the time it takes to switch from the primary to the backup server.

Keep the data current
At this point, you’ve set up a great backup scheme, tested it and created a complete set of documentation. You’re ready to go home when you get a call…the SQL server is down. Upon further inspection, you find that the database is just gone. Being prepared, you restore last night’s backup. But there is a kicker because a database is always a moving target. Restoring last night’s backup at the end of today could result in the loss of a multitude of data. As such, it is important to back up the changes between the most recent regular backup and a failure. This can be accomplished by backing up the database transaction logs. By default, however, a database’s transaction log can’t be backed up.

In an initial installation of SQL Server, each database is assigned a simple recovery model unless you change it. A Simple recovery model means that a point-in-time backup is the only one supported. Data added or changed between the last backup and the time of a failure is at risk of loss. To change a database to Full recovery, which allows the restoration of data all the way up to the point of failure, you need to open the properties of the database and, on the Options tab, change the recovery model from Simple to Full. If you feel it important to have a full recovery model in place, I recommend that you back up the transaction log more than once a day.

Changing the recovery model from Simple to Full allows a database restoration all the way up to the point that a database failed. The administrator needs to restore the last full backup and then the transaction logs will bring it up to date. Keep in mind that a full recovery model will result in very large transactions logs if you never do a full SQL-based backup.