If an enterprise’s data is its lifeblood, then the database design can be the most important part of an application. Volumes have been written on this topic, and entire college degrees have been built around it. However, as has been said time and time again here on Builder.com, there’s no teacher like experience.
I’ll get the show started by listing my five favorite tips and giving a brief explanation of the rationale behind each one. If you have a general design tip (try not to be specific to a particular system) you’d like to share, post it to the discussion. Where possible, explain the reasoning behind your tip by including an example or anecdote.
#1: Use meaningful field names
I once worked on a project I inherited from another programmer who liked to name fields using the name of the onscreen control that displayed the data from that field. That’s all well and good, but unfortunately, she also liked to name her controls using some strange convention that combined Hungarian notation with the order in which she added the controls to the UI: cbo1, txt2, txt2_b, and so on.
Unless you are using a system that restricts you to short field names, make them as descriptive as possible—within reason, of course. It’s possible to go overboard with this; Customer_Shipping_Address_Street_Line_1 is very descriptive and meaningful, but no one would want to have to type it more than once.
#2: Do your homework
Not only should you research your business needs when designing a new database, you should check out the existing system, as well. Few database projects are built from scratch; there is almost always an existing system (maybe not computerized) that the organization is using to fulfill its needs.
Obviously, the existing system is not perfect; otherwise, you wouldn’t be building a new one. But by studying it, you may discover nuances that you would otherwise have missed had you ignored it. If nothing else, examining the existing system is usually good for a chuckle or two.
#3: Take the user’s keys away
When deciding which field or fields to use as keys in a table, always consider the fields that users will be editing. It’s usually a bad idea to choose a user-editable field as a key. Doing so forces you to take one of these two actions:
- Restrict the user from editing the field after the record’s creation. If you do so, you may discover that your application isn’t flexible enough when business requirements suddenly change, and users need to edit that uneditable field. What happens when a user makes a mistake in data entry and doesn’t notice until the record is saved? Delete and re-create? What if the record isn’t re-creatable; suppose the customer left?
- Provide some way of detecting and correcting key collisions. Usually, this can be done with some effort, but it is expensive in terms of performance. Also, a key correction may wind up being possible only from outside the data layer, forcing you to break the isolation between your data and business/UI layers.
The underlying maxim here is this: Make your design fit the user, don’t make the user fit the design.
#4: Don’t over-normalize
For those unfamiliar with the term, normalization helps eliminate the redundancy of data in a database by ensuring that all fields in a table are atomic. There are several forms of normalization, but the Third Normal Form (3NF) is generally regarded as providing the best compromise between performance, extensibility, and data integrity. Briefly, 3NF states that:
- Each value in a table is to be represented once and only once.
- Each row in a table should be uniquely identifiable. (It should have a unique key.)
- No nonkey information that relies upon another key should be stored in the table.
Databases in 3NF are characterized by a group of tables storing related data that is joined together through keys. For example, a 3NF database for storing customers and their related orders would likely have two tables: Customer and Order. The Order table would not contain any information about an order’s related customer. Instead, it would store the key that identifies the row containing the customer’s information in the Customer table.
Higher levels of normalization exist, but is “more normal” necessarily better? Not always. In fact, for some projects, even 3NF may introduce too much complexity into the database to be worth the rewards.
#5: Build in a fudge factor
Not much is worse than discovering, or being informed, that your “finished” database is missing a field for a crucial piece of information. At one company I worked for, this was such a common occurrence that we began referring to “databases freezes” as “database slushes.”
So, unless table size is a serious concern, always build in a fudge factor to allow for new fields, even if this means just adding a couple of extra text fields to the end.
It’s your turn
Now that you’ve seen my tips, it’s time to share some of your own. Post your general database design tips to the discussion or just e-mail them to the editors.