Teach your Access users to be wary of AutoNumbered primary keys

Too often, end users don't realize the importance of choosing the right primary keys for their Access tables. This article outlines the potential problems that can occur when using an AutoNumbered primary key.

As an active Microsoft Access developer, I’ve got a good view of one side of the database development environment. But as an active Microsoft Access instructor, I also get to see what the rest of the nontechnical developer world is doing with Access. It’s this latter view of Access that has revealed how common it is for nontechnical Access users to incorrectly use an AutoNumber field type as a table’s primary key.

Here is why it’s so important to steer end users away from this dangerous practice.

Ignorance is rarely bliss when it comes to Access
Most business users, or even junior developers, do not have an in-depth understanding as to what a primary key is, let alone the ramifications of an ill-conceived primary key selection.

When a user saves a new table in Access, a dialog box pops up asking if the user wants to create a primary key if one doesn’t exist. The message says, “Although a primary key isn’t required, it’s highly recommended. A table must have a primary key for you to create a relationship between this table and other tables in the database. Do you want to create a primary key now?”

Many of the users I’ve talked to in introductory, intermediate, and sometimes even advanced Access classes usually say, “I saw the message and I figured, what the heck, seems like a good thing to do.” So, while they are actually taking the right step in creating a primary key, they don’t know how to correctly complete the task.

A simple plan
Poorly designed applications are usually more cumbersome and costly to maintain than those with the benefit of good initial planning.

Well-designed databases are generally characterized by a group of tables storing related data that is joined together through keys. For example, a database that stores information on customers and their related orders would likely have two tables: Customers and Orders. The Orders table would not contain any information about an order’s related customer (address, phone number, and so forth). Instead, it would contain the key that identifies the row containing the customer’s information in the Customers table.

AutoNumber: A double-edged sword
When choosing a key, it's generally a bad idea to choose a field that can be edited by a user. Doing so forces you either to restrict the user from editing the field after the record’s creation or to provide a way of detecting and correcting key collisions. If you restrict the user from editing the field, you may discover that your application isn’t flexible enough. The second choice is problematic as well; providing for the detection and correction of key collisions can be too complicated, seriously hindering your application's performance.

The AutoNumber datatype offers a handy solution to this problem but not without making your application more vulnerable to failure. On the positive side, using the AutoNumber datatype provides a field that will give you a unique value for every record, while also paving the way for establishing relationships between multiple tables. The negative side is that the application stands a much better chance of failing if the AutoNumbered values become corrupt.

Recovering from corrupt AutoNumbered values
A single-table database with an AutoNumbered primary key isn't at serious risk even if the key becomes corrupt. For example, if you rely on the AutoNumber data to double as your customer number, then it would be problematic but not impossible to rekey the values from a paper report or your file folders, allowing you to save the bulk of your data.

However, if you have a multiple-table database with relationships between a table with an AutoNumber Primary Key (Customers) and a corresponding Long Integer field acting as the Foreign Key in another table (Orders), you could face a much worse scenario.

Figure A
The Customers table has a relationship to the Orders table via the CustomerID field, which is the Customers table's primary key.

Let’s say your Customers table has an AutoNumbered primary key and your Orders table has a relationship back to the Customers table, as shown in Figure A. If the primary key in the Customers table goes bad, you will have no direct way of reestablishing the relationship to the Orders table. This assumes that your sole connection from the Orders table back to the Customers table is the primary key. So what can you do if you still want to use the AutoNumber datatype as a primary key?

The answer is that you need a way to rebuild a key using other fields in the table in the event that the AutoNumber does fail. If you had to start over with an AutoNumber field in Access after some type of disaster, it wouldn’t be of much use to you. This is because gaps in the sequence from deleted records would no longer exist, so your foreign keys in the other tables would no longer correctly match back to the corrupt table. You would be forced to develop a new keying methodology.

If you had stored the client phone number, for example, in the Orders table, you could rebuild the relationship to a degree. Database purists may bristle at this idea because it violates the principle of not duplicating data unnecessarily. However, in the practical world, you have to balance safety with design purity. A little extra baggage in a table in order to increase your recoverability in the event of a disaster is usually a pretty good trade-off.

With these observations, you should now be able to help your less technical Access developers better understand the ramifications of using an AutoNumber field as their primary key. The end result is something any Access user would be thrilled to have: better-designed, less vulnerable Access applications.

Editor's Picks