As a developer or software architect in today’s market, you may be called upon to design a database, even if you’ve never done it before. Companies are cutting costs everywhere, but skimping on the data modeling process might mean your database needs frequent modifications or a redesign after deployment.
The concepts presented here will help you create a data model that fits your needs, even if you don’t have the luxury of performing every step.
Building from the top down
In a perfect world, there are three stages to developing a database: planning, design, and deployment. In the real world, especially when a database will be used for a single application, up-front planning receives scant attention, and most of the work is lumped into the last two phases, design and deployment.
If possible, extend the design stage to include the three concepts listed below before deployment. The data your application depends on will take on a logical quality that promotes reuse and transcends any one application.
The Conceptual Data Model
An effective data model must consider business processes. After reviewing the business rules that apply to the relevant information, you create a Conceptual Data Model (CDM). This usually includes an Entity Relationship Diagram (ERD), a high-level representation of business information and its sources and endpoints and the events that occur in all steps in between.
CDMs are created without consideration of database concepts. Business practice, not information similarities, defines relationships. This seems irrelevant to how information is stored and managed, but it’s an important first step to understanding how stored data is used.
There are various schools of thought about CDM information, and the definition of this stage will vary within different methodologies. For example, CDM attributes are usually associated with data, and sometimes keys and other database concepts are included as attributes.
A business or systems analyst normally creates the CDM and applies it to both application and data design. When you’re creating a data model, check to see if this step was performed as part of the project’s due diligence. Reviewing this information will help you complete the next stage in data modeling.
The Logical Data Model
The logical design of a database involves taking the business information discovered in a CDM and translating it into an empirical layout of the data, usually in a relational style. Experienced data architects often omit this stage, even though data flow diagrams taken from a business point of view can be complicated and may require an intermediary interpretation. The logical design facilitates the final stage before you deploy your database.
In the previous step, you identified information and assigned attributes. In a Logical Data Model (LDM), that information is further processed by identifying data field names in a normalized layout and including previously omitted attributes like data types. Identify indexes based on frequency of data access. Finally, you must create keys and identifiers that show associative entries.
Normally a data analyst performs this step. At the very least, modify the entity relationship diagram created during initial discovery to include data relationships. Table structures should start to become apparent. Once you’ve completed this stage, the data model is ready to enter the final planning stage.
The Physical Data Model
The Physical Data Model (PDM) is the final stage of planning. It describes an absolute design of the database implementation. A data architect, or sometimes a database administrator, completes this stage.
The final layout of tables, columns, indexes, and constraints is defined, hopefully with the help of a schema diagram. You’ll adapt the logical model derived from the business concepts to this model for database deployment.
This phase requires technical expertise since data optimization, key constraints, and other such concepts must be considered. Once you’ve defined the physical model, it can be used to generate Database Description Language (DDL) scripts for creating the database.
At this point, the data model takes on a form far removed from the original business language, but it’s been built in such a way as to incorporate the initial intentions.
Data modeling resources
These three design phases describe the general progression of any data model. There are various methodologies for performing data analysis and deployment. The most widely used of these define processes that complete the phases presented above, generally with their own twist on achieving the described result.
Below I’ve included links to useful information and data modeling resources. I recommend becoming familiar with them and deciding which will work best in the circumstances relevant to your deployment and work situation.
- Resource Description Framework (RDF) Model Theory: This is a working draft, supported by the W3C, that attempts to provide a mathematical framework for describing resources and their relationships. This model theory relies heavily on the Hayes-Menzel Knowledge Interchange Format (KIF).
- The Zachman Institute for Framework Advancement (ZIFA): This relatively new technique provides a description of data modeling for translating business needs into data architecture.
- InfoAdvisors Data Model Group: Professional community and message board focused on data modeling.
- Applied Information Science Web site: Last updated in 2000, but offers some good background information on data modeling theory.
- Pal’s Linux RDBMS Library: Don’t let the name fool you—this page has numerous pieces by the well-respected data model theorist C. J. Date and the first part of E. F. Codd’s famous 1970 article, A Relational Model of Data for Large Shared Data Banks. Codd is the father of relational databases.
A logical approach to database design
In designing a data model, it’s important to understand the reasoning behind the information in order to create a source that is useful beyond a single application deployment and to avoid reworking your design once new data is introduced. The phased approach introduced in this article will help you create an intelligent database design.
What role do you play in database design at your organization?
Role definition and design progression vary from company to company. Let the community know how things have shaken out at your place of work by joining the discussion below or sending us an e-mail.