Data Management

Perils of Adding fields to Database Tables

If you're not careful, when you add a field to a database table, you may cause all sorts of problems. Today, database consultant Tim Chapman looks at two such scenarios, and how you can plan for their occurrence.

 Adding a field to a database table is sometimes not as simple as it may seem.  If certain situations are not considered when adding fields, you may find yourself looking at some long nights fixing database code.  Today, database consultant Tim Chapman looks at two such scenarios, and how you can plan for their occurrence.

SELECT ALL

A SELECT * statement returns all field values from a table.  While this is not only a lazy practice, returning more fields than needed will lead to performance problems in your application.  In addition to potential performance problems, SELECT * statements can cause your application code to break when additional fields are added to your database table.  Consider the following example.  Database table Sales has four fields; SaleID, SalePrice, SaleDate, and Product.  A front end application calls the following stored procedure to acquire sales for the last two days.

CREATE PROCEDURE usp_GetSalesLastTwoDays

AS

SELECT * FROM Sales

WHERE SaleDate >= DATEADD(d, -2, GETDATE())

 

When this procedure is run, it returns all four fields from the Sales table.  If the front-end application is code in such a way so that it tries to handle only these four fields, returning an additional field may result in a broken front end.  While something like this should be caught through testing, little situations like this may end up leaking through to production code which would result in a "broken" application.

The most simple method to correct this situation is to not use a SELECT * statement.  If you always explicitly list the field names that you want your procedures to return then you're much more likely to avert these types of problems as well as enjoy some performance gains from not returning all fields in the SELECT statement.

Table Joins

Adding database fields to tables that participate in JOIN statements (probably most of your tables) can potentially cause problems.  These problems can arise when the name of the field added also exists as a field name in the joined table and table aliases are not used.

To make this a bit clearer, consider the following example.  The Products table has 3 fields; ProductID, ProductCode, and ProductDescription.  The Sales table has the following fields; SaleID, ProductID, SaleDate.  Most queries join these tables in the form of:

SELECT SaleID, SaleDate, ProductCode

FROM Products p

JOIN Sales s ON p.ProductID= s.ProductID

Because most queries are only concerned with the ProductCode from the Products table, a developer as decided to add the ProductCode field to the Sales table to avoid the lookup.  Adding the ProductCode to the Sales table will not only go against normalization rules (which is sometimes OK), but will immediately cause the query listed above to fail.

The reason is that no table aliases are used when listing the fields, so the optimizer will not know whether the query is specifying the ProductCode from the Products table or the Sales table.  To fix this problem, any statement that looks at the ProductCode from the JOIN of these two tables and which does not use table aliases must be rewritten.  To avoid this problem entirely, ALWAYS use table aliases when joining two or more tables.

Conclusion

Adding a field to a table should not cause your programs to crash.  The problems encountered here could be very easily avoided if you include these types of situations in your database programming standards and ensure they are enforced.  A little bit of looking forward can prevent many hours of reprogramming.

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

0 comments

Editor's Picks