Data Management

Use Perl DBI to build portable scripts for database access

Perl DBI is a must-have for your arsenal. This article provides an introduction and sample code for connecting to a database with Perl.

Database-independent applications make it easy to change and upgrade your database or use different brands on remote or local hosts without extensively changing your code. The key to creating such apps is the Perl5 Database Interface (DBI) module. Read on to learn more about this module and to work through several examples.

How DBI can do what it does
The DBI module is a generic database interface that provides an API that enables scripts to use database drivers (DBD) produced by various database manufacturers. DBI comes bundled with a few initial drivers, such as ODBC and Proxy (used for client-server connections), and an example for driver developers. However, I’ve found that the included drivers vary on different platforms, and some may need to be reinstalled or updated depending on your Perl5 distribution.

To obtain DBDs not included with DBI, contact your database manufacturer or look in the downloads section of its Web site. DBI includes support for several proprietary databases, including Oracle, Sybase, Informix, and SQL Server, and a number of open source servers, as would be expected. You can even use the DBI module to access files in a variety of formats.

Use your normal method of module installation to include support files for your database. You can find a number of drivers at CPAN, but I recommend looking on manufacturers’ sites for the most recent versions.

Getting started
If you maintain your own server, it's helpful to know how to determine what drivers are available on your system. First, use your modules manager to find out if DBI is installed; if it isn't, get it from CPAN.

Once you’re set, use the DBI class methods available_drivers() and data_sources() to find out what drivers and databases currently exist. The simple script in Listing A should suffice.

After you run the script, pay attention to the output. If you’ve already installed the appropriate DBD and your database is running, you’ll see output similar to:
Driver name: mysql
       Data Source -> DBI:mysql:mysql
       Data Source -> DBI:mysql:test

In this output, two MySQL databases exist, “mysql” and “test.” Any additional databases you have created will also be listed. DBI will need the case-sensitive driver name shown in your output to create a connection to your database.

If the driver you are expecting isn't listed, try reinstalling the appropriate DBD module or looking for a more recent or alternate version. Or you might just stick to the ODBC driver, available from CPAN. If the driver shows up but not the databases you’ve created, check to make sure your database server is running and accepting connections.

Your next course of action is to connect to your database and manipulate it. The sections below describe how to do this and provide examples along the way. If you’re a big-picture kind of person and would like to see a complete sample script, you can refer to Listing B.

Connecting to your database
DBI’s API allows you to use the same function set for any database with an appropriate driver. Manufacturers may have included additional functionality for you to use, but the core methods are the same. If you stick to these methods, the only difference in your script from one database system to the next will be in your connect method:
my $dbh = DBI->connect( “DBI:mysql:test”, “dbuser”, “password”,
       { RaiseError => 1, PrintError => 1, AutoCommit => 1 });

This method accepts four parameters and returns a database handler identifier. The datasource parameter is defined by “DBI:” followed by the driver name, then by the database instance, which can optionally include further information such as hostname and port. The next two parameters are your database’s username and password.

The fourth parameter is optional. It consists of a list of attributes that can be assigned to the handler and that will apply to all executions involving that handler. The statement above includes three commonly used attributes. The attribute, RaiseError, is set to 0 by default and acts as an “or die” statement, terminating the script when an error is encountered. PrintError is enabled by default and displays error text from DBI::err and DBI::errstr. AutoCommit is also enabled by default, and will issue a warning if the commit method is additionally used to execute database changes.

Once a connection is made, multiple statements can be passed to the database. When you’re through issuing queries, the connection is closed with the disconnect method:

This renders the database handler inactive and is generally used at the end of a script, after changes have been committed.

Issuing queries
There are two basic methods for issuing queries. If your statement does not require results to be returned, such as with an UPDATE, INSERT, or DELETE query, you can use the do method. It returns the number of affected rows instead of a statement handler, which I’ll describe shortly. The basic syntax for the do method is:
$rc = $dbh->do($statement);

where $statement is an SQL query. The do method executes the query immediately.

The other basic option is to use the execute method. This method requires the query to be “prepared” prior to execution. To prepare a statement, perform the following:
$sth = $dbh->prepare($statement);

The returned statement handler can then be executed as many times as needed, can offer additional information about the query, or can allow for use of other functionality such as a concept called placeholders. You can learn more about this from the Perl DBI manual page.

Once a statement has been prepared, you use the execute method to send it to the database:
$rv = $sth->execute;

At this point, if the AutoCommit attribute has been unset and your database supports transactions, you would issue a commit statement:
$rc = $dbh->commit;

The execute method can be called multiple times from the same statement handler; however, it’s a good idea to return results between executions. To do that, you use one of a number of available methods, such as fetchrow_array, fetchrow_arrayref, fetchrow_hashref, and fetchall_arrayref. Each method returns the data in a particular format—a list of field values, an array of field values, a reference to a hash of name-value pairs, or a reference to an array of references, respectively. Take the example below using fetchrow_hashref:
while ( my $href = $sth->fetchrow_hashref ) {
       print “Name:\t”.$href->{name}.”\n”;
       print “Phone:\t”.$href->{phone}.”\n”;

The fields “name” and “phone” appear in our database and are available as keys with the fetchrow_hashref method.

Once you have finished parsing the data that's returned, you can tell the statement handler that you’re through with the following statement, which releases the query from the database:

Other methods, such as selectrow_array, consolidate many of these actions. Be sure to review the DBI manual page for details on these and other advanced features.

Expanding your Perl repertoire
The basic functionality described above is only the beginning. The DBI module also includes features for binding values to placeholders in query prepare statements and caching result sets. For databases supporting transactions, there is also a rollback method. Database manufacturers each have additional DBI methods, defined within their DBD. You can find more information about those features from your database manufacturer’s documentation.

Using the Perl5 DBI module, you can build portable scripts that support multiple database sources. Once you’ve mastered this functionality, it will become a staple in your Perl bag of tricks.



Editor's Picks