Data Management

Derby: Large, powerful DBs come in small packages...

Last time we looked at a very recent convert to the Open Source universe, Terracotta. Today we'll look at a not-so-recent convert, but one that has really shaken up the database space: Apache Derby.

Apache Derby started life as a Java database project called "JBMS", developed by a small start-up back in 1997 during the first "dot com" boom. That start-up, called Cloudscape, was acquired by the object-relational database vendor Informix, which in turn was acquired by IBM. IBM renamed the database Cloudscape, after its original authors, and, after adding a lot of functionality to it and using it in various projects, donated it to the Apache Software Foundation in 2004.

After a period of "incubation" in which the Cloudscape code was tested, vetted, and cleared of any potential IP (Intellectual Property) encumbrances, it was renamed "Apache Derby" and released in 2005 under the very generous Apache License. This is all nice history, but what exactly is Derby? And why is it interesting in terms of the movement of closed source software onto the Open Source stage?

Derby is a full SQL database package that supports standard ANSI SQL, transactions, concurrency, triggers and even online backups (a limited form of replication). What makes Derby pretty amazing is that the same engine can be used, completely unmodified, to implement a full SQL server and client system. It's not meant to be a massive multi-user system, but it's a perfectly acceptable solution for testing large web applications before installing a big production DB like Sybase, Oracle or PostgreSQL; it could also be used as the middleware DB for systems like network management applications.

What really makes Derby stand out from other embedded databases is that it is a pure-Java database: a full SQL engine that can be embedded into any Java program. Most other embedded databases that support Java need at least some native, non-Java helper code in order to function. And Derby is small: about 2 MB. Let's think about that for a moment: a full SQL engine in about 2 MB of compiled Java bytecode that can be put into anything. That "anything" can be a web application, a J2EE server, a stand-alone application ... or something more portable, like an application that lives inside a PDA or even a cell phone.
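To make "embedded" concrete, here is a minimal sketch of a Java program opening a Derby database in-process over JDBC. It assumes the Derby jar is on the classpath and that the JDBC driver is discovered automatically (older setups may need to load the driver class explicitly); the database name demoDB and the notes table are hypothetical, chosen only for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class EmbeddedDerbySketch {

    // Builds an embedded-mode JDBC URL. The ";create=true" attribute asks
    // Derby to create the database if it does not exist yet.
    static String embeddedUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        // "demoDB" is a hypothetical database name (assumption).
        String url = embeddedUrl("demoDB");

        // The engine runs inside this JVM; no separate server process is started.
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {

            // Hypothetical table, for illustration only.
            stmt.executeUpdate(
                "CREATE TABLE notes (id INT NOT NULL PRIMARY KEY, body VARCHAR(200))");
            stmt.executeUpdate(
                "INSERT INTO notes VALUES (1, 'hello from embedded Derby')");

            try (ResultSet rs = stmt.executeQuery("SELECT body FROM notes WHERE id = 1")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        } catch (SQLException e) {
            // Reached if Derby is not on the classpath, or if any statement fails.
            System.err.println("Could not run the embedded example: " + e.getMessage());
        }
    }
}
```

If Derby is not on the classpath, the program simply reports the failure and exits. Running it a second time against the same demoDB directory would also end up in the catch block, since the notes table would already exist.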

A few years ago it would have been inconceivable to "embed" a SQL database engine into a phone or a PDA, but even the smallest of "smart phones" have over 64MB of RAM and/or flash these days; some (like the Treo 650/680/700, some Nokia models and others) sport SD or miniSD slots where the user can install multi-gigabyte memory cards. This makes access to information easier, enables applications that span the mobile and data center worlds, and raises the bar on what can be done in small form factor devices.

So, where is this being used?

Everywhere! Derby (which is also distributed by Sun Microsystems under the name JavaDB) is being used as an off-line persistence cache for AJAX applications, allowing them to remain useful even when the user's browser isn't connected to the Internet. Some applications use Derby/JavaDB as a local SQL database for storing data in the field that is then automatically sync'd with an "upstream" database (like a corporate repository) when the mobile user connects back to their home network. Not only could this work from a PDA or cell phone, but one could even put an application and its database on a "USB key" type flash drive. The possibilities are endless.
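The store-locally-then-sync pattern can be sketched roughly as follows. This is only an illustration under stated assumptions: plain in-memory collections stand in for the local Derby/JavaDB tables, the upstream database is modeled as a hypothetical callback that reports whether a change was accepted, and all class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class OfflineSyncSketch {

    // Stand-in for a local table of changes not yet sent upstream (assumption:
    // a real application would keep these rows in the embedded database).
    private final List<String> pending = new ArrayList<>();

    // Called while off-line: the change is only stored locally.
    void recordLocally(String change) {
        pending.add(change);
    }

    int pendingCount() {
        return pending.size();
    }

    // Called when the device is back on its home network. "upstream" is a
    // hypothetical callback returning true if the change was accepted.
    // Changes are pushed in order; the first failure stops the run so that
    // nothing is sent out of sequence. Returns the number of changes synced.
    int syncUpstream(Predicate<String> upstream) {
        int synced = 0;
        Iterator<String> it = pending.iterator();
        while (it.hasNext()) {
            if (!upstream.test(it.next())) {
                break;
            }
            it.remove();
            synced++;
        }
        return synced;
    }

    public static void main(String[] args) {
        OfflineSyncSketch cache = new OfflineSyncSketch();
        cache.recordLocally("visit logged for customer A");
        cache.recordLocally("order drafted for customer B");

        // Off-line: the upstream callback rejects everything, so nothing is lost.
        cache.syncUpstream(change -> false);
        System.out.println("pending while off-line: " + cache.pendingCount());

        // Back on the network: every change is accepted and the queue drains.
        cache.syncUpstream(change -> true);
        System.out.println("pending after sync: " + cache.pendingCount());
    }
}
```

Keeping unsent changes until the upstream side confirms them is what lets the application keep working without a connection; conflict resolution, which a real deployment would need, is deliberately left out of this sketch.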

This brings us to the question that everyone asks about companies who Open Source their proprietary code: Why?

IBM is a pretty smart company; they're not lacking in code they can market, or in people and companies who need their services. There's nothing explicitly stated in the Derby incubation docs over at the Apache Software Foundation about why Derby was donated. But given IBM's very generous contributions to the Open Source movement over the last 5+ years, in both code and royalty-free access to its patent portfolio, one can reasonably conclude that IBM has realized that, as the old saw goes, "A Rising Tide Lifts All Boats." IBM donates code; talented developers dream up new applications IBM never thought of; new markets are born in which IBM can sell its wares. IBM has become a leading light in pushing the idea that not everything in the world needs to be protected like the crown jewels, and that, in fact, opening up the corporate code and IP treasury often leads to huge returns later on.

Seems like a pretty smart strategy to me...


Apache Derby source: Apache Derby
Terracotta source: open Terracotta
