Using Hash Fields in SQL Server

From auditing data changes to capturing data for loading a data warehouse, hash fields have a ton of practical uses. In today's article, database consultant Tim Chapman shows how you can use built-in SQL Server functions to create your own hash fields.

 A hash field is a calculated numeric field based on the value(s) contained in one or more fields.  This calculated value is great for detecting data changes, which can be put to use in auditing strategies or datawarehouse ETL (extraction, transformation, loading) operations.

Creating a hash field in SQL Server is easy.  The database engine has a couple of built-in functions that can be used to generate the hashed value.  Two of these functions are CHECKSUM() and BINARY_CHECKSUM().  CHECKSUM() is intended to be used to build hash indexes while BINARY_CHECKSUM() is great for detected data changes.  I'll be using the BINARY_CHECKSUM funtion for the purposes of todays writing.

Using the hash field

In the following example I'll show you how you can take advantage of the BINARY_CHECKSUM() function.  First, I'll create the SalesHistory table and load some data into it.

CREATE TABLE [dbo].[SalesHistory]


                SaleID int IDENTITY(1,1),

                Product varchar(10) NULL,

                SaleDate datetime NULL,

                StatusID TINYINT NULL,

                SalePrice money NULL








SET @i = 1

WHILE (@i <=500)



                INSERT INTO [SalesHistory](Product, SaleDate, SalePrice, StatusID)

                VALUES ('Computer', DATEADD(ww, @i, '3/11/1919'),

                DATEPART(ms, GETDATE()) + (@i + 57), 1)


                INSERT INTO [SalesHistory](Product, SaleDate, SalePrice, StatusID)

                VALUES('BigScreen', DATEADD(ww, @i, '3/11/1927'),

                DATEPART(ms, GETDATE()) + (@i + 13),5)


                INSERT INTO [SalesHistory](Product, SaleDate, SalePrice, StatusID)

                VALUES('PoolTable', DATEADD(ww, @i, '3/11/1908'),

                DATEPART(ms, GETDATE()) + (@i + 29),8)


                SET @i = @i + 1





ALTER TABLE SalesHistory


Once my table is created, I can add a calculated field to the SalesHistory table using the BINARY_CHECKSUM() function.

ALTER TABLE SalesHistory

ADD HashField AS BINARY_CHECKSUM(SaleID, StatusID, Product)


The HashField I create above creates a checksum on the SaleID, StatusID, and Product fields.  The resulting value is a signed integer value based on the values in the SaleID, StatusID, and Product fields.  As the values in these fields the HashField value will change also.  I can then use this HashField value to keep track of changes to specific data fields, which makes it great for capturing changes for loading a data warehouse.

Capturing these changes can be as easy as comparing each hash value to a previously stored hash value to determine changed rows to as complicated to finding row data changes and then routines to find the changed values.

Regardless of how you use it, using hash values in SQL Server is a great tool.


By Tim Chapman

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.