When
properly implemented, a relational database can greatly enhance the
availability of data and information for an enterprise’s decision makers.
However, deploying a relational database on almost any scale requires a
thorough understanding of the fundamental concepts and rules that govern their
behavior.
Although my
SQL is one of the most popular relational database applications available, that
does not mean that it is immune from the complexities associated with the setup
of a relational database. In their book, Get It Done With MySQL 5,
Arthur Fuller and Peter Brawley describes the basics of using MySQL, including
building queries and using the various built in APIs. Chapter 1, Relational
Databases, available from TechRepublic, explains the fundamental
concepts underlying relational databases, including sets, tables, Codd‘s 12 Rules, normalization,
and the limits of the relational model.
In the
following interview, authors Arthur Fuller and Peter Brawley discuss common
problems enterprises must overcome when implementing a relational database and
the future of MySQL as an open source application.
Interview
[TechRepublic] Relational databases can be a very
powerful business tool if planned and deployed correctly. Unfortunately, all
too often the implementation of these systems is not as good as it could be. In
your experience, what are so many organizations missing when it comes to
planning their relational databases? Is it one common shortcoming, or does it vary with each enterprise?
[Arthur Fuller] As Fred Brooks said, every
significant program should be written twice: once to understand the problem and
once to solve it. The main thing organizations of all sizes are missing is
this: If the database is complex, you are guaranteed to get it wrong the first
time. But it’s hard for managers to accept this and factor it into their
schedule.
[Peter Brawley] One reason nothing’s as good as it
could be is that database designs and programs are like novels: development’s
not finished, it’s just abandoned at a point of diminished return.
There’s one
problem I often see: buying into the particularities of one system. It’s
entirely understandable; faced with multiple RDBMS implementations the organization
picks one and hangs on for dear life. SQL standards were supposed to fix that,
but could not. Every implementation has its limitations and weaknesses, and
eventually those weaknesses can make for big trouble if the organization hasn’t
stayed light on its feet.
[TechRepublic] In the chapter excerpt, you state a
principle for practical database design: “All primary keys shall be
meaningless.” Why is this so important?
[Arthur Fuller] To be fair about this, some data
modeling experts disagree. Their argument is that a meaningless primary key (PK)
models nothing in the real world. I agree with that statement, but in the
applications I have worked on it is unimportant. Suppose, for example, you’re
selling eggs by the dozen and crate, or nuts and bolts, or DVD movies. Even if
you could think of a way to individuate each instance, what good would it do?
Do you really need to know which chicken laid which egg, and which slot in the
egg carton that particular egg occupies?
Furthermore,
PKs that embed meaning must be invented by the users.
I have seen numerous databases where what seemed like good and clear rules for
creating PKs turn out over time to have
contradictions, ambiguities and inconsistencies. I have seen users sitting
there scratching their heads trying to invent a new PK and duplicating existing
keys again and again.
I tend
toward the maxim, “Anything the system can do, the system should do.”
PKs are one example.
[Peter Brawley]PKs have
a purely logical purpose: to numerically guarantee row uniqueness. No
real-world datum can make that guarantee. If you use national id numbers (for
example Social Security Numbers in the US & Canada), you have to deal with
transcription errors and millions of fakes, if you use the t-shirt company’s
shirt codes, the company will eventually have to revise its coding, and so on:
you wind up having to edit primary keys in parent and child tables, a no-no for
many reasons.
[TechRepublic] MySQL is an extremely popular
database application and is used by many enterprises in conjunction with their
Web sites. How would you rate MySQL in terms of capabilities and features? Is
MySQL best suited for particular applications? Is it not well-suited for
others?
[Arthur Fuller] Perhaps the most outstanding feature
of MySQL is its performance. Perhaps its weakest area is security. The MySQL
team has responded to demands for stored procedures and views. This is the most
important new development in several versions. Finally, you can’t ignore the
price. I wouldn’t recommend that a production database be set up without a
support contract, but even then the price is a small fraction of what Oracle or
Microsoft would charge.
[Peter Brawley] Open source, price, performance,
and the huge and extremely helpful user community are four reasons I think
MySQL will be around for a long time.
[TechRepublic] MySQL has a storied history and is
very popular, but are there other open source databases, perhaps more powerful
or having more features, on the horizon? Will MySQL be as popular 10 years from
now?
[Arthur Fuller] As Werner Heisenberg said,
“Prediction is difficult, particularly of the future.” Things change.
Sybase has released a free version of its database. Borland has done the same
with Interbase, and Computer Associates with Postgres. And who knows what revolutionary technology is
being built in some basement as we speak? MySQL will be around for years.
Whether it becomes the dominant database is as much a marketing question as it
is a technology question.
[Peter Brawley]Can’t add
anything to that.
More detail
Download the free book excerpt, Relational
Databases, for
a more comprehensive examination of the basic relational database principles
established by Dr. Edgar Frank Codd and for insight
on how you can put those axioms to practical use in your next database project.