Data Management

Building a data values dictionary with SQL

When a client asked a developer to construct a data values dictionary, the client was actually referring to a list of the distinct values in various columns and tables. Check out the basic SQL that was developed to fulfill this client's request.

TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!

When a client asked one of my colleagues to construct a data values dictionary, the client didn't mean what developers think of when they hear that term. He was referring to a list of the distinct values in various columns and tables. I'll present the basic SQL I came up with (after overcoming a few hurdles) to solve this problem.

First, let's talk about the requirements. You may not have an interest in knowing the distinct values for all columns. Typically, the "interesting" columns might be items such as ZipCode or ProductID in the SalesDetails table.

Obtaining the values is the easy part; it simply requires a SELECT DISTINCT. But since I need something a little more powerful and useful, I began by building a table called DataValuesDictionary, which you can see in Listing A.

The SELECT DISTINCT part of the clause will supply only the distinct values—you still need to populate the other columns. The simplest way to do this is to include these values directly in the SELECT statement, like this:

SELECT DISTINCT 'Customers' as TableName, 'CompanyName' as ColumnName, 
'Varchar(100)' as DataType, CompanyName as DataValue FROM Customers

This is fine for one table and one or two columns but, in order to go further, I turned the SQL above into a stored procedure. This required a little footwork, primarily to get the single-quotes working correctly. The procedure is offered in Listing B.

You can see the fancy footwork in the assembly of the SQL statement. The line beginning "set @s" must wrap the parameters in single quotes and deal correctly with the commas between the column names.

Note: For brevity, I did not include the code required to convert all column types to varchar. A DateTime column will work, but a Money column won't. In the unlikely event that you really want to add such a column's values to the dictionary, add code to test the DataType parameter and a block that CASTs or CONVERTs the data type to varchar.

Now that you have a data values dictionary, there are a number of reporting tools that will let you deliver an attractive printout to clients.

Editor's Picks