This week is MongoDB World 2022, and for those of us who were around for the first MongoDB World in 2014 (like me), it’s fair to say that much has changed over the past eight years for the company and the industry.
While I’m clearly biased – I worked for MongoDB in 2014, and after a few years with AWS and Adobe, I rejoined in mid-2021 – it’s interesting to see how the database market has evolved in less than a decade. But it’s perhaps just as interesting to see how some things have stayed the same.
Not just relational data: A key shift in the database market
Back in June 2014, the top five most popular databases were exactly the same as June 2022: Oracle, MySQL, Microsoft SQL Server, PostgreSQL and MongoDB.
The difference is their relative standing: PostgreSQL and MongoDB have both gained on the relational incumbents. PostgreSQL, for example, has grown at Oracle’s expense. Today its popularity score (measured in terms of things like job postings and LinkedIn profile mentions) is roughly half of Oracle’s, versus roughly 16% of Oracle’s back in 2014.
There are a number of reasons for this shift in the market. I’ve written before about PostgreSQL’s rise. Here I’m going to focus on MongoDB, and what it means.
In his opening keynote, MongoDB CEO Dev Ittycheria shared statistics that showed MongoDB has become mainstream data infrastructure for over 35,000 customers, from Fortune 500 companies to garage-dwelling startups alike.
Customers are one metric to measure adoption, but MongoDB’s imprint is even broader. Though downloads are no longer a primary metric for this predominantly cloud-focused company (60% of the company’s revenue derives from Atlas, its cloud service), in 2014 the company counted downloads in the tens of thousands. Today that number stands at 265 million, with more people downloading MongoDB’s community product in 2022 than in the first 11 years of MongoDB combined.
That’s a lot of adoption for a product that in 2014 still got “web scale” eye rolls. The parody video behind that phrase was funny, though MongoDB never really struggled for scale. If anything, that video captured a feeling that relational databases could take care of most application requirements. As such, the challenge for MongoDB in 2014 was to convince developers to consider a world beyond relational data and tabular data structures.
MongoDB has always handled data relations just fine; it has simply handled them differently than a relational database does. So, back then, the company accepted the NoSQL label, despite its problems (who wants to be defined by what they’re not?), because it helped developers think beyond tabular data structures.
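To make “differently” concrete, here is a minimal sketch of the same order-to-items relation in both shapes. Everything in it is hypothetical: the order data, the field names and the `load_order_relational` helper are invented for illustration, and plain Python dictionaries stand in for table rows and documents. It is not MongoDB’s API, just a picture of where the relation lives.

```python
# Hypothetical order data; plain Python structures stand in for rows and documents.

# Relational shape: two flat tables linked by a foreign key (order_id).
orders = [{"order_id": 1, "customer": "Ada"}]
order_items = [
    {"order_id": 1, "sku": "pen", "qty": 2},
    {"order_id": 1, "sku": "ink", "qty": 1},
]


def load_order_relational(order_id):
    """Reassemble one order by joining the two tables on order_id."""
    order = next(o for o in orders if o["order_id"] == order_id)
    items = [
        {"sku": i["sku"], "qty": i["qty"]}
        for i in order_items
        if i["order_id"] == order_id
    ]
    return {**order, "items": items}


# Document shape: the same relation, expressed by embedding the items.
order_document = {
    "order_id": 1,
    "customer": "Ada",
    "items": [
        {"sku": "pen", "qty": 2},
        {"sku": "ink", "qty": 1},
    ],
}

# Both shapes carry the same relation; only where it lives differs.
same = load_order_relational(1) == order_document
```

In the relational shape the application (or the database) performs a join at read time; in the document shape the relation is stored the way the application reads it.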
Since that time, there has been an explosion in non-relational, or multi-model, databases. Today, DB-Engines includes nearly 400 databases, but fewer than half of them are relational databases. From document to time series to graph to columnar to key-value to [insert new database type here], the industry has kept using relational databases even as it has found a home for a wide variety of new databases.
As RedMonk analyst Steve O’Grady has written: “The era of a single category of general purpose databases gave way to a time of specialization, with database types selected based on workload and need.”
Back in 2018, Amazon CTO Werner Vogels captured this movement in a blog post:
“For decades, because the only database choice was a relational database, no matter the shape or function of the data in the application, the data was modeled as relational,” Vogels said. “Instead of the use case driving the requirements for the database, it was the other way around. The database was driving the data model for the application use case. Is a relational database purpose-built for a normalized schema and to enforce referential integrity in the database? Absolutely, but the key point here is that not all application data models or use cases match the relational model.”
Vogels keynoted that first MongoDB World in 2014. To capitalize on customer interest, he then helped AWS launch over a dozen new, “purpose-built” database services over the subsequent eight years.
The rise and rise of general purpose databases
More recently, we’ve seen the database market “revert to the mean,” as it were, with developers returning to general-purpose databases like MongoDB and PostgreSQL. The reason, explained O’Grady, is simple – or, rather, is about simplicity: “The overhead today of having to learn and interact with multiple databases has become more burden than boon.”
He added that enterprises have pushed database vendors to augment capabilities because they don’t want “to context switch between different datastores, because they want the ability to perform things like analytics on a given in place dataset without having to migrate it, because they want to consolidate the sheer volume of vendors they deal with, or some combination of all of the above.”
In 2014, MongoDB helped to spark an industry trend toward specialization; in 2022, it’s part of a movement away from specialization. The irony is that MongoDB has never touted specialization, and has instead marketed itself as a general-purpose database from the start. Why? Because, as O’Grady explained, “general purpose” makes developers’ lives easier, and MongoDB has always focused on developer convenience.
Which is why some of the biggest news from MongoDB World 2022 shouldn’t be news at all: The company increasingly positions MongoDB as a developer data platform, not a database.
Databases become data platforms
Again, let’s look at this against an industry backdrop: A range of companies are trying to provide one-stop shops for data scientists, business analysts or other groups, and are therefore marketing data clouds and data platforms. What seems unique about MongoDB’s approach, by contrast, is its focus on developers.
Even though MongoDB can credibly claim that its document model has significantly improved developer productivity, application requirements keep forcing developers to take on the uncomfortable responsibility of connecting a sprawling set of backend data systems, including search, real-time analytics and more. These services, in turn, require management like logging and alerting. Guess who has to stitch them all together? Developers. As MongoDB CTO Mark Porter quipped, this leaves developers with “more glue than model.”
Indeed, throughout the opening keynotes, MongoDB executives kept reiterating the company’s focus on developers. But now, instead of being about flexible schema or horizontal scale, the company touted an elegant developer experience that spans an increasingly broad set of services to support a complete data lifecycle, thereby supporting a wider array of use cases, from transactional to operational through analytical.
By largely abstracting away the movement of data between services or products, MongoDB’s developer data platform aims to help enterprises significantly reduce investments in middleware, in terms of people and software/systems, while also reducing the need to reconcile data across systems and helping organizations ensure a single source of truth.
The idea of a data platform, or data cloud, isn’t new, and it’s not unique to MongoDB. As I said, this is an industry trend toward vertical integration to make developers’ lives (or data analysts’ lives, in the case of data warehousing vendors) easier. But what is different, and seems unique to MongoDB, is this idea of a developer data platform: Something that makes developers much more productive with data.
Clearly a key part of this for MongoDB is analytics, but not those analytics. Even if I didn’t work at the company, it would be hard to come away from MongoDB World thinking that MongoDB was planning to compete for data warehouse workloads.
Instead, the keynotes revealed a lot of thinking about analytical workloads that drive engaging in-app experiences. Like? Well, like personalization applications that determine promotions to display at checkout based on what has recently been shown. Or, like security applications that analyze network activity to separate good domains from bad.
Traditionally, the analytical systems that are good for these workloads have been separate from operational systems. If separate sounds good, it’s really not: It introduces cost and complexity through things like ETL, and it distances applications from the back-office systems that feed them their data.
This batch-oriented world may be the status quo, but it makes for bad application experiences for customers. MongoDB clearly agrees, and said so repeatedly at the event through a variety of announcements.
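A toy sketch of why the batch approach hurts the checkout example above. All of it is assumed for illustration: the event data, the promotion names and the `extract_transform_load` and `pick_promo` helpers are hypothetical, and plain Python lists stand in for both the operational store and the separate analytical copy.

```python
# Hypothetical operational store: promotions recently shown to each shopper.
shown_events = [
    {"shopper": "s1", "promo": "free-shipping"},
    {"shopper": "s1", "promo": "10-percent-off"},
]


def extract_transform_load(events):
    """Batch job: copy events into a separate per-shopper summary."""
    summary = {}
    for e in events:
        summary.setdefault(e["shopper"], set()).add(e["promo"])
    return summary


def pick_promo(candidates, already_shown):
    """Return the first candidate promotion the shopper has not seen yet."""
    for promo in candidates:
        if promo not in already_shown:
            return promo
    return None


candidates = ["free-shipping", "10-percent-off", "gift-wrap", "loyalty-points"]

# The batch job runs, and afterwards the shopper is shown one more promotion.
warehouse = extract_transform_load(shown_events)
shown_events.append({"shopper": "s1", "promo": "gift-wrap"})

# Separate analytical copy: stale, so it repeats the promotion just shown.
stale_choice = pick_promo(candidates, warehouse.get("s1", set()))

# Analyzing the operational data in place sees the latest event.
fresh_seen = {e["promo"] for e in shown_events if e["shopper"] == "s1"}
fresh_choice = pick_promo(candidates, fresh_seen)
```

Here the stale copy picks `gift-wrap` again, while the in-place query moves on to `loyalty-points`. Real systems are far more involved, but the gap between the copy and the source is the point.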
Which brings me back to just how different our industry is today than it was in 2014 and just how much is the same. We still rely on relational databases and will for some time, as I wrote back in 2016. And yet it’s equally true that enterprises increasingly depend on non-relational databases like MongoDB.
In both camps, we’ve seen developers flirt with special-purpose databases, while investing more in general-purpose databases and, more recently, general-purpose data platforms. As the Talking Heads might sing: “Same as it ever was.”
Disclosure: I work for MongoDB, but the views expressed herein are mine, not those of my employer. Just ask: They often disagree with me.