Data Management

What to consider when moving from MySQL to PostgreSQL

Upgrading to a new product sometimes compels you to completely rethink how you approach a problem. That's what happened when one member migrated from MySQL to PostgreSQL. See what he discovered when he made the switch.


MySQL might be the easiest relational database to learn. It might also be the fastest for certain basic tasks, such as serving up dynamic Web pages. It’s even easy to install and administer. However, its strengths are largely tied to its limitations, and support for many high-end database features is scarce. So I decided to take a look at PostgreSQL.

For the first several months after I moved from MySQL to PostgreSQL, I thought it was just a bigger, more complex database that did the same things as MySQL. It took me a while to really realize how great the additional features are.

Here, we will look at my migration from that database to PostgreSQL. I’ll share some of the things I learned and explain what you should consider before making a similar upgrade.

Some differences between PostgreSQL and MySQL
In general, PostgreSQL makes a strong effort to conform to existing database standards—which isn’t always the case with MySQL. If you’re coming from a MySQL or Microsoft Access background, some of the changes (such as not using double quotes to quote string values) can seem strange. Here are some comparisons:
  • MySQL uses nonstandard # to begin a comment line; PostgreSQL uses — (double dash). This is the ANSI standard, and both databases understand it.
  • MySQL uses single or double quote marks to quote values (e.g., WHERE name = “John”). This is not the ANSI standard for databases. PostgreSQL uses only single quote marks (e.g., WHERE name = ‘John’). Double quote marks are used to quote system identifiers, such as field names and table names (e.g., HERE “last name” = ‘Smith’).
  • MySQL uses the ` (accent mark or backtick) to quote system identifiers, which is decidedly nonstandard.
  • PostgreSQL is case-sensitive for string comparisons. The field “Smith” is not the same as the field “smith”. This is a big change for many users from MySQL and other small database systems, like Microsoft Access. In PostgreSQL, you can:
    —Use the correct case in your query (e.g., WHERE lname= ‘Smith’)
    —Use a conversion function, like lower() to search (e.g, WHERE lower(lname)= ‘smith’)
    —Use a case-insensitive operator: ILIKE or *~
  • Database, table, field, and columns names in PostgreSQL are case-independent, unless you created them with double quote marks around their name, in which case they are case-sensitive. In MySQL, table names can be case-sensitive or not, depending on which operating system you are using.
  • PostgreSQL and MySQL seem to differ most in handling of dates and in the names of functions that handle dates.
  • MySQL uses C language operators for logic (e.g., ‘foo’ || ‘bar’ means ‘foo’ OR ‘bar’; ‘foo’ && ‘bar’ means ‘foo’ AND ‘bar’). This might be marginally helpful for C programmers but violates database standards and rules in a significant way. PostgreSQL, following the standard, uses || for string concatenation (e.g., ‘foo’ || ‘bar’ = ‘foobar’).

There are other differences between MySQL and PostgreSQL, such as the names of functions for finding the current user. MySQL has a tool called Crash-Me that can be useful in identifying these differences. Ostensibly, Crash-Me is a comparison tool for databases; however, it tends to seriously downplay MySQL’s deficiencies and isn’t very objective in what it lists. The entire idea of having procedural languages (a very important feature for many users!) is relegated to a single line near the bottom of the document, while the fact that MySQL allows you to use || for logical OR (definitely nonstandard) is listed well before this as a feature. Be careful about Crash-Me’s interpretations.

The larger picture
To me, the real question isn’t just, “How do I do this MySQL thing in PostgreSQL?” but rather, “Is there a much better way to think about this, which MySQL doesn’t even support?” For example, imagine you’re building a dynamic Web site for human resources. You want to include every current senior staff member's name, some information about them, and a list of their goals.

With MySQL, you'd do something like this:
(This is generic pseudo-code; it would easily translate to PHP, Zope, EmbPerl, etc.)
<in sql= “SELECT staffid, firstname, lastname FROM Staff
          WHERE datefired ISNULL and seniorstaff = TRUE”>
    <h1>$firstname $lastname</h1>

      <ul>
          <in sql= “SELECT goalinfo FROM Goals WHERE staffid=$staffed”>
          <li>$goalinfo
          </in>
      </ul>

</in>

It works fine, and you can easily translate this to PostgreSQL. But would you want to? PostgreSQL has many features MySQL doesn’t, such as:
  • Views
  • Procedural languages
  • Triggers
  • Customizable aggregates
  • Transactions

So, for instance, rather than coding in the Web front end the logic above (is-not-fired and is-senior-staff), in PostgreSQL, I'd make a VIEW of all staff for which we want to show goals:
CREATE VIEW staff_having_goals AS
SELECT staffid, firstname || lastname as fullname
FROM Staff
WHERE datefired ISNULL and seniorstaff = TRUE
ORDER BY lastname, firstname


Now my Web programming doesn’t have to worry about the lower-level concerns. Imagine that this same list of people and goals appeared dozens of times on your site—I’ve moved from having it scattered in many places to having it encapsulated in one place.

PostgreSQL also allows you to use procedural languages (perl, tcl, python [alpha], and an Oracle-alike, PL/pgSQL) to create functions in your database (and even nonsysadmins can use them, as the functions fit in the PostgreSQL security model). Yes, MySQL has user functions, which, last time I checked, had to be written in C and linked into the database. A nice feature, to be sure, but very different from having high-level procedural languages usable without root privileges!

We might use these procedural languages to create lists and handle database events (if a record is added here, automatically track this there, and so on). For example, you could have a function to calculate a staff member’s hourly compensation from his or her salary, which IMHO, should be a database function, not a function coded for every Web project or front-end project you have.

PostgreSQL also has transactions, which can remove some of the hairy if-an-error-happened-back-out-all-database-work code. (MySQL, to its credit, has grafted in transactions in their new MaxSQL distribution.)

In addition, PostgreSQL supports many standard parts of SQL that MySQL doesn’t, such as subqueries, unions, and intersections. While you can often program around these, either with more SQL or more logic in the front-end, the best (fastest, more portable, most abstracted) solution is to integrate this thinking into your query writing and database design.

Summary
As you can see, the things that work differently in PostgreSQL are fairly small and can generally be handled without too much pain—especially since you can easily create PostgreSQL user functions that mimic any from MySQL. I’d start with the five features discussed above (views, procedural languages, triggers, customizable aggregates, transactions) and make sure that you understand exactly what they are, how to use them, and how wonderful they are. The key is to learn what features PostgreSQL offers—and then figure out what they can do for you!

Are you changing databases?
Do you use either MySQL or PostgreSQL? What has your experience been with these products? Send us an e-mail with your thoughts and experiences or post a comment below.

 

Editor's Picks