In an earlier tip, I showed a way to process a table that
stored multiple entities in the same table. Let’s take a step back. What should
you think about when designing such a table?
In theory, every entity you uncover during the design
process would result in a separate table. If your application deals with
employees, you’d create an Employees table; if those employees have managers,
there will be a Managers table.
In practice, however, when two entities are very similar it
can be more efficient to use a single table for both. The employee and manager
entities become “subtypes” of an expanded Employees table.
The main decision you have to make is what to do with the
differences in the subtypes. If the different subtypes require the same kind of data, just a different interpretation, you can use the same set of columns for all
subtypes. The other option is to create subtype-specific columns. Each approach
has its advantages and disadvantages.
When reusing columns, you will need to add another column to
the table to indicate the subtype of a given row. This is important for
interpreting the data in the overloaded columns.
No such indicator column is needed when separate columns are
used. An IS NULL test on any column unique to a subtype will indicate if the
row is of that subtype. A data value indicates it is that subtype; a NULL value
indicates it is some other subtype.
You can also model more than just two subtypes in a table,
but the same principles above apply to these situations as well. You will
probably wind up using a combination of both approaches, depending on the data.
Suppose we want to model both employees and managers in the
same table. In addition to the columns in common between the two entities,
employees have an hourly rate and managers have an annual salary. In addition,
managers have an assigned parking space number from 1-50, whereas employees
have a general parking pass number up to 9999.
Listing A shows a CREATE TABLE
statement in which the same two columns are used for both subtypes. Listing B
shows what the CREATE TABLE statement would look like if separate columns are
Notice that in Listing A, NOT NULL constraints can be used
on the columns to prevent missing data, whereas in Listing B they cannot. Also
notice that the CHECK constraints in Listing A must be applied at the table
level because they reference more than one column. Listing B’s CHECK
constraints can be applied at the column level because
they only deal with the current column’s contents. They’re simpler, too, but
there are more of them.
Miss a tip?
Check out the Oracle archive, and catch up on our most recent Oracle tips.
Subscribe to the Data Insider Newsletter
Learn the latest news and best practices about data science, big data analytics, artificial intelligence, data security, and more. Delivered Mondays and Thursdays