New to SQL Server 2008 is the native ability to compress database backups. This great new feature results in a smaller (sometimes much smaller) backup file, which saves disk space, and can lessen the time latency involved in some disaster recovery schemes, such as log-shipping.

Backup Compression

As more and more information is stored in corporate databases, storing database backups becomes increasingly important. The problem is that these backups can take up a sizable amount of disk space. There are third party tools such as SQL Backup from Red Gate software that will compress these backups for you, but you have to purchase them. SQL Server 2008 Enterprise Edition now provides this functionality as part of the BACKUP DATABASE statement.

It is important to note that this article solely references the new backup compression feature in SQL Server 2008 Enterprise Edition. The new SQL Server production also includes functionality for database compression, which I will outline in a future article.

In the following example, I take a look at how backup compression works, and compare the results from the compressed database against a normal full backup created in SQL Server 2008.

Getting Started

First, I will create and load a table in my database. Note that the table below contains one variable character field. This is significant in that the more character fields your database contains, the more compressed your database backup will likely be.

DROP TABLE SalesHistory
CREATE TABLE [dbo].[SalesHistory]
[Product] [varchar](10) NULL,
[SaleDate] [datetime] NULL,
[SalePrice] [money] NULL

Loading data

Now I’ll load about 30,000 records into my SalesHistory table.

SET @i = 1
WHILE (@i <=10000)
      INSERT INTO SalesHistory(Product, SaleDate, SalePrice)
VALUES ('Computer', DATEADD(mm, @i, '3/11/1919'), DATEPART(ms, GETDATE()) + (@i + 57))
      INSERT INTO SalesHistory(Product, SaleDate, SalePrice)
VALUES('BigScreen', DATEADD(mm, @i, '3/11/1927'), DATEPART(ms, GETDATE()) + (@i + 13))
      INSERT INTO SalesHistory(Product, SaleDate, SalePrice)
VALUES('PoolTable', DATEADD(mm, @i, '3/11/1908'), DATEPART(ms, GETDATE()) + (@i + 29))
      SET @i = @i + 1

Once I have my data loaded, I can use the system stored procedure sp_spaceused to see how large my database is. After loading the above data, my database occupies approximately 21 megabytes on disk.

EXECUTE sp_spaceused

Execute the statement below to enable the capture of the time it takes to parse, compile, and execute TSQL statements. I will use this information to compare the time involved to take a normal SQL Server backup against a compressed SQL Server backup.


Squashing things down

The following statement uses the WITH COMPRESSION option of the BACKUP DATABASE command to create a compressed backup on the C drive on my SQL Server computer. The compressed backups takes up approximately 463 KB on disk.

TO DISK = 'C:\SalesHistoryCompressed.bak'

Note: You can also enable backup compression as the default at the server level using sp_configure and the ‘backup compression default’ option.

From the results of my STATISTICS TIME statement, I can see that the compressed backup statement took about 0.733 seconds

Moving data to Drive C

The statement below also creates a database backup on my C drive, but leaves out the compression option. This backup takes up approximately 2700 KB on disk.

TO DISK = 'C:\SalesHistoryNOTCompressed.bak'

From the results of my STATISTICS TIME statement, I see that the normal backup statement took approximately 0.618 seconds to complete.

The following statement can be used to examine the backup size information contained in the backupset system view in the msdb database.

SELECT TOP(2) compressed_backup_size, backup_size,
CompressionPercentage = 1-(compressed_backup_size/backup_size)
FROM msdb..backupset
WHERE database_name = 'SalesHistory'
ORDER BY backup_set_id DESC

There is a trade-off

You can see from the information gathered above that there is a trade-off when creating compressed database backups in SQL Server 2008. The backup size is significantly smaller, but it comes at the cost of taking more time and CPU cycles to complete the backup. These differences may be more pronounced as the size of the backups become larger. The only way to know for sure if using backup compression is the right choice for you and your organization is to try it out with your data sets.