Why some of the fastest growing databases are also the most experimental

Everyone has heard about MongoDB and Cassandra, but what other databases are making big gains against Oracle and Microsoft?

It's no surprise that NoSQL databases like MongoDB, Apache Cassandra, and Redis have cracked the top 10 most popular databases worldwide and continue to eat away at the decades-long dominance of relational databases. Nor is it surprising that database offerings from the public cloud giants have quickly catapulted into the mainstream.

What is surprising, or at least revealing, is how many databases that still elude the spotlight are making dramatic climbs up the database popularity charts. The biggest movers on the DB-Engines database rankings show robust interest in graph and document databases, not to mention anything that Amazon or Microsoft ship.

Relative stasis

DB-Engines measures database popularity across an amalgam of factors: job postings, mentions in discussion forums like Stack Overflow, Google searches, and more. It's an imperfect yet still potent way of measuring database popularity and, not surprisingly, the higher in the rankings a database sits, the harder it is to gain ground on its rivals.

As such, within the top 10 databases we see some upward movement (MongoDB, Cassandra, and SQLite), but only single-digit movement. From #11 to #30, the movement can be more pronounced (MariaDB jumping five places and AWS DynamoDB climbing four), but the bigger worry is databases that have sat stagnant over the year (Couchbase, HBase), or fallen multiple spots (Memcached, MarkLogic).

Even so, among these top 30 databases, the past year saw limited swings in popularity. Instead, it has been a matter of slowly plodding up or down the popularity rankings. Outside the top 30, however, it's a very different story.

The wild west of databases

The most vibrant experimentation in the database market happens on these outskirts of mainstream database technology. While the dream of polyglot persistence, that mythical realm where developers choose whichever database is deemed fittest for a particular purpose, has foundered on the rocks of reality (it turns out someone has to support all those databases), it's still the case that today's niche databases can go mainstream tomorrow.

Hence, databases 31-100 are where serious user experimentation is happening. Here are the databases with the biggest upward swings in popularity this past year:

  • Impala (from #45 to #37) - Cloudera's massively parallel processing (MPP) SQL query engine for data stored in Hadoop.
  • OrientDB (from #49 to #41) - A distributed graph database that combines some elements of document databases.
  • Google BigQuery (from #51 to #42) - Google's data warehouse.
  • Titan (from #59 to #44) - A distributed graph database increasingly paired with Cassandra.
  • RethinkDB (from #70 to #46) - A distributed document-oriented database.
  • Aerospike (from #68 to #47) - A high-performance key-value store.
  • InfluxDB (from #78 to #48) - A database geared toward IoT time series data.
  • MemSQL (from #93 to #69) - A distributed in-memory database that lets you process transactions and run analytics in real-time, using SQL.
  • Microsoft Azure DocumentDB (from #99 to #73) - Microsoft's cloud-based document database.
  • PouchDB (from #101 to #77) - A document database that lives inside the browser, obviating the need for queries over a network and helpful for constrained bandwidth scenarios.
  • Amazon Aurora (from #119 to #83) - Amazon's MySQL-compatible relational database, built as an improvement on MySQL.
  • Kdb+ (from #110 to #95) - A multi-model database.
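
For readers who want to compare the climbs directly, the rank changes listed above can be turned into places gained with a few lines of Python. This is only a rough sketch: the names `RANK_MOVES` and `places_climbed` are illustrative and not anything DB-Engines provides, and the numbers are simply copied from the list.

```python
# Rank moves copied from the list above: name -> (rank a year ago, rank now).
# RANK_MOVES and places_climbed are illustrative names, not a DB-Engines API.
RANK_MOVES = {
    "Impala": (45, 37),
    "OrientDB": (49, 41),
    "Google BigQuery": (51, 42),
    "Titan": (59, 44),
    "RethinkDB": (70, 46),
    "Aerospike": (68, 47),
    "InfluxDB": (78, 48),
    "MemSQL": (93, 69),
    "Microsoft Azure DocumentDB": (99, 73),
    "PouchDB": (101, 77),
    "Amazon Aurora": (119, 83),
    "Kdb+": (110, 95),
}


def places_climbed(moves):
    """Return (name, places gained) pairs, biggest climb first."""
    gains = [(name, old - new) for name, (old, new) in moves.items()]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, gain in places_climbed(RANK_MOVES):
        print(f"{name}: up {gain} places")
```

By this count Amazon Aurora made the largest jump (36 places), followed by InfluxDB (30) and Microsoft Azure DocumentDB (26). Places gained is a crude yardstick, since climbing from #119 to #83 is easier than moving the same distance near the top of the chart.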

What can we glean from these results? First off, graph databases are clearly on the rise (OrientDB, Titan). So are document databases (PouchDB, Microsoft Azure DocumentDB, RethinkDB, OrientDB). Not surprisingly, the cloud is driving the popularity of some of these databases (Microsoft Azure DocumentDB, Amazon Aurora), and all of them reflect the industry's need for new ways to manage big data at scale.

Will any of these break through into the top 10? By definition, there's not room for all of them, and it's also clear that there's a lot of friction involved in displacing the general purpose workhorses that currently sit atop the database heap (Oracle, SQL Server, MySQL, PostgreSQL, etc.). Still, it would surprise me if we didn't see Amazon Aurora—the fastest growing AWS service in the company's history—keep bounding up the rankings and eventually settle into the top 10.

That's the principal database to watch, but there are plenty of others, with the ones mentioned above among the strongest contenders.

About Matt Asay

Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.
