Amazon just open sourced an easier path to PostgreSQL

Commentary: Companies have been trying to shift to open source databases like PostgreSQL for decades. A new open source project may make that much easier.


For as long as databases have existed, people have been trying to migrate to something new. There are many reasons for this, but those reasons don't change the reality that migrating databases is hard.


Breaking up with your database turns out to be exceptionally hard. As former MongoDB (now Google Cloud) executive Kelly Stirman once told me in an interview, "Databases are one of the stickiest products out there. Switching costs are very high. Database replatforming is viewed as one of the riskiest propositions in all of IT. No matter what, for existing workloads, transition will be slow."

This hasn't stopped people from trying. In fact, 2020 seems to be the year when database migration services (DMS) hit overdrive. Earlier this year Microsoft introduced its Azure Database Migration Service to help customers move database workloads to Azure. For its part, MongoDB partnered with Informatica to guide enterprises from Oracle to MongoDB. And just a few weeks ago, Google announced its Database Migration Service, a serverless offering designed to move on-premises database workloads to Google's cloud. 


Although these and similar services may differ in how they migrate databases, they all focus on migration. They're also similar in that they're designed to migrate databases to their respective cloud services. 

AWS just announced something very different. 

Do you speak Babelfish?

As AWS CEO Andy Jassy revealed in his Tuesday keynote at the company's re:Invent conference, AWS is open sourcing Babelfish for PostgreSQL, an Apache 2.0-licensed project that acts as a new translation layer for PostgreSQL. What does that mean? It means that PostgreSQL can now understand commands from applications written for Microsoft SQL Server without the developer having to change database schema, libraries, or SQL statements. For all intents and purposes, the application thinks it's talking to SQL Server, but it's now talking to PostgreSQL. 

This is different from traditional migration services, which involve converting the database schema from one database to another, then taking data from the first database and loading it into the target database, with any necessary ETL (extract, transform, and load) steps thrown in to complete the process. While a DMS is very useful, such services tend to be error prone and still require quite a bit of work even after the migration (e.g., a developer still needs to change the application to use the target database's drivers and often has to rewrite internal application code to make it all work).

In contrast, Babelfish does none of that. Babelfish lets you load your data into PostgreSQL with no conversion and lets your application think it's still talking to SQL Server. 
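To make the contrast concrete, here is a minimal sketch of what "the application thinks it's talking to SQL Server" could look like from the application's side. Everything in it is illustrative: the `AppDbConfig` class, both hostnames, and the sample query are hypothetical, and it assumes the Babelfish endpoint is reached on the same port the application already uses. It opens no database connection; it only compares the settings before and after the move.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class AppDbConfig:
    """Hypothetical bundle of what an application needs to reach its database."""
    driver: str  # client library the application already uses
    host: str    # server the application connects to
    port: int    # network port for that server
    query: str   # a SQL Server-dialect statement embedded in the application


# Hypothetical starting point: an application written for SQL Server.
before = AppDbConfig(
    driver="ODBC Driver 17 for SQL Server",
    host="sqlserver.example.internal",  # placeholder hostname
    port=1433,                          # SQL Server's default port
    query="SELECT TOP 5 name FROM customers ORDER BY created_at DESC",
)

# Assumed Babelfish move: point the same application at a PostgreSQL server
# running Babelfish. Driver and SQL text are deliberately left untouched.
after = replace(before, host="babelfish.example.internal")  # placeholder hostname

# List the settings that differ between the two configurations.
changed = [f.name for f in fields(AppDbConfig)
           if getattr(before, f.name) != getattr(after, f.name)]
print(changed)
```

In this sketch the only setting that differs is `host`. With a traditional migration, `driver` and `query` would typically have to change as well, which is the rework described above.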

So that's cool, but what I like best is that it's open source. This is important for two reasons.

Open sourcing a path to PostgreSQL

First, Babelfish enables a company to move to PostgreSQL and run that database...anywhere. On-premises? Yep. On AWS? Of course. On Azure/GCP/Alibaba/cloud-of-your-choice? Absolutely. In this way, Babelfish could actually be the heart of a multicloud strategy, at least for database workloads. Build on PostgreSQL and run those workloads anywhere you want. Yes, AWS would prefer that enterprises use Babelfish to run more PostgreSQL applications on AWS, but because it's open source, the user decides where they want to run their PostgreSQL workloads.

And, let's be clear, developers love to run PostgreSQL however and whenever they can. PostgreSQL has always been a popular choice with developers, but that popularity has exploded over the last 10 years. Every year for the past several years developers have crowned it the relational database they most love. SQL Server is a great database but, given the chance to run more PostgreSQL…? I'm betting developers will jump at that chance.

The second reason Babelfish's open source approach is important is that it enables a community to steer the project. A host of individuals and companies have spent decades trying to help companies migrate from SQL Server to PostgreSQL. PostgreSQL community member Ian Harding, for example, started working on this roughly 20 years ago. In this process, they have learned a great deal about SQL Server and PostgreSQL, and the nuances of their differing semantics. AWS says that at launch, Babelfish will cover the majority of common application scenarios. But given the size of the SQL Server surface area, Babelfish needs the expertise of an open source community to cover the long tail of SQL Server functionality. 

Gartner analyst Merv Adrian once told me, "The greatest force in legacy databases is inertia." This inertia derives from investments in data models, supply chains that are deeply integrated with a particular database, or simply fear of failure. With open source Babelfish, hopefully some of that inertia can be removed, paving the path to more PostgreSQL.

Disclosure: I work for AWS but the views expressed herein are mine. That said, for the Babelfish for PostgreSQL open source launch, I was an advisor on community engagement.
