
Amazon Web Services cloud chief Andy Jassy wants you to run all your workloads on the AWS cloud alone. While multi-cloud sounds cool, and could be a great way to avoid lock-in, it actually increases costs and complexity, he argues.
Well, he would say that, wouldn’t he?
Even so, he’s right. There’s a reason enterprises tend to standardize on just a few vendors: It makes life easier. For those enterprises that still lie awake at night worrying about lock-in, here’s a tip: Buy all the cloudy infrastructure you want from your vendor of choice, but try to keep your data separate.
Let me explain.
All your data are belong to us
Today, it’s very common to build on AWS, Microsoft Azure, or Google Cloud Platform, and host a database of your choice there. Plenty of companies bring their own database to the cloud, be it Postgres, Cassandra, Oracle, or Microsoft SQL Server.
However, one trend has accelerated, and it should worry the lock-in conscious.
In an effort to make the cloud even simpler for customers, the major cloud vendors have been developing their own database services. AWS has RDS, Aurora (the fastest-growing product in AWS history), DynamoDB, and more. Microsoft has Azure DocumentDB and Azure SQL Database. Google, not to be outdone, has Cloud Bigtable, Cloud Datastore, and Cloud SQL.
Yes, these are helpful services. But they’re probably exactly where you don’t want to store your data if you’re looking to escape lock-in.
Separating your data from the infrastructure
This thought hit me while reading up on MongoDB’s new database service, Atlas. While the company promotes Atlas as a better way to run the popular NoSQL database, its real benefit may have more to do with lock-in, or the lack thereof.
Though the big public clouds are built on open source, “they are probably the most proprietary software in the history of software,” MongoDB vice president Kelly Stirman told me in an interview. The problem gets worse as each cloud builds out more and more services of its own, so that there’s never a reason to leave.
The problem becomes most acute with the database, Stirman told me. Why? Because “the database has the most inertia,” he explained. “It’s the hardest thing to move because it has state. And it has the most valuable asset, the data itself.”
Once your data sits in Azure DocumentDB, it’s really (really) hard to move it if you want to go to AWS, for example. You’re stuck.
That’s not because Microsoft (or AWS or Google) is evil. Not at all. It’s just because the gravity holding that data in place is very, very strong.
This is one of the major selling points of Atlas. As Stirman noted, “Atlas will run on multiple clouds, and offers the same features you get on-prem. Users can move between clouds if they want, and even deploy across multiple clouds.”
While the costs and complexity of running on multiple clouds can add up, as mentioned above, there are good reasons for retaining that power of movement between them:
- Maybe Azure has better coverage in China, so you’ll use AWS everywhere except China, where you keep copies of your data on Azure.
- Maybe you like Google Cloud Platform’s pricing for storing your backups, but your primary copies are on AWS.
- Maybe you grow so much that on-prem makes sense, so you move to your own datacenter.
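To make those scenarios concrete, here is a minimal sketch of what such a placement policy might look like if you wrote it down as data. Everything in it is hypothetical: the `Placement` type, the `PLACEMENT_POLICY` table, the region keys, and the `placements_for` helper are illustrative names, not part of Atlas or any cloud vendor's API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Placement:
    """Where one copy of the data lives (hypothetical illustration)."""
    provider: str  # e.g. "aws", "azure", "gcp", or "on-prem"
    role: str      # "primary", "replica", or "backup"


# Hypothetical policy mirroring the scenarios in the list above:
# primaries on AWS, backups on Google Cloud Platform, and copies
# of the data on Azure for China.
PLACEMENT_POLICY: Dict[str, List[Placement]] = {
    "default": [Placement("aws", "primary"), Placement("gcp", "backup")],
    "china": [Placement("azure", "replica")],
}


def placements_for(region: str) -> List[Placement]:
    """Return the placements for a region, falling back to the default."""
    return PLACEMENT_POLICY.get(region, PLACEMENT_POLICY["default"])


if __name__ == "__main__":
    for region in ("us-east", "china"):
        print(region, placements_for(region))
```

The point of the sketch is that when the database is independent of the infrastructure, changing where the data lives amounts to editing a table like this one, rather than migrating to a different database service.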
For battling lock-in, and in general, “the database is the most important tech decision,” Stirman said, “and it’s the one you are going to revisit the least frequently.” As such, it may be smartest to keep the database independent from the underlying infrastructure. The big clouds may not like it, but your peace of mind just might.