Stop adopting multicloud to achieve application resilience, says Honeycomb's Charity Majors

Commentary: Multicloud may be good strategy in some instances but not for delivering application resilience. Here's why.

For enterprises still searching for the holy grail of multicloud, Honeycomb co-founder and CTO Charity Majors has a suggestion: keep looking. Or, rather, don't. 

To be fair, Majors doesn't suggest that multicloud isn't useful for application portability (though Gartner has), or that it can't be a great tool in contract negotiations (Duckbill analyst Corey Quinn will assure you that it's not). No, Majors' point is merely that multicloud won't yield "magical pixie dust" of resilience. Put more bluntly, she declares, "it is NOT the way to fix your reliability problems." 

Why?

When multicloud works

But first, it's worth remembering that despite all the hype around multicloud, it's mostly just how enterprise IT works and always has. If one great promise of cloud computing was to unshackle developers from cumbersome hardware procurement processes ("Yes, I need a server for my application. Can I get that by next year? Please??"), that freedom has led developers to buy into a range of clouds to access cloud-specific services to build their applications. 

This is what I call "incidental multicloud." It just happens.

There is also "intentional multicloud," but I primarily see it among vendors (including my current employer, MongoDB) that need to meet their customers wherever they happen to be. If a customer comes to Confluent and wants to run Confluent Cloud on Microsoft Azure, it's a poor sales strategy to tell them, "No, we only support Alibaba" (or whatever cloud). So vendors with cloud services may support multicloud so that customers can use data stored in different clouds to run a single application, perhaps storing data in Cloud X for cost or other reasons but analyzing it in Cloud Y for its superior analytics services, without having to move the data manually.

But in that case the vendor does the heavy lifting to give the customer a seamless infrastructure/database/application/whatever experience. That's not what Majors calls crazy. It's the customers who take on the burden of multicloud management themselves in the hopes of greater resilience ("I'll string my app across multiple clouds in case one goes down") whom Majors wants to talk out of it.

It turns out that the path to safety isn't increased complexity.

Making hard things even harder

"I understand not wanting a single point of failure. But when you add a cloud you don't get more reliability; you almost certainly get less," Majors noted. Why? Because adding complexity doesn't simplify things. Majors added: "Instead of worrying about AWS being down a few min a year, now you have to worry about AWS, GCP, and the unholy plumbing between them." 

This seems obvious, yet apparently it's not. Not to everyone, anyway.

In fact, she went on, it's the plumbing that creates the most problems, "because you are not better than AWS or GCP at building and operating systems. Promise." As mentioned above, there are companies that do have the resources to ensure seamless operation of particular infrastructure/services between clouds (continuing the example above, Confluent promises it can "automate[] building and monitoring data pipelines and streaming applications, while offloading the operational burden of your developers"), but they don't purport to be generalists delivering resilience and other benefits no matter the workload. They're specialized so that IT needn't be.

Multicloud may well help with availability issues, but it's not really a strategy for improving resilience, not without help anyway. There's simply too much that can't easily be predicted in the connections between the clouds (Majors: "[I]f you're treating [multicloud] like a hot failover, you will not test it often enough to prevent nuts & bolts flying off each time"). Again, if you have a vendor with the expertise to handle these connections for you, fine. But trying to run generalized workloads across clouds on your own isn't a recipe for resilience; it's a recipe for lots of resilience whac-a-mole.
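To put rough numbers on Majors' argument, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the availability figures, the 50% failover success rate, and the function names are hypothetical and come from neither Majors nor any cloud provider. It also treats failures as independent, which real outages often are not.

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of a path in which every component must be up."""
    return prod(components)


def failover_availability(primary: float, standby: float, failover_success: float) -> float:
    """Availability when a standby takes over only if the failover itself works."""
    return primary + (1.0 - primary) * standby * failover_success


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime for a given availability, in minutes per 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


# Illustrative assumptions only; none of these figures come from the article.
CLOUD = 0.9999    # each cloud, considered on its own
PLUMBING = 0.999  # self-built cross-cloud networking, replication, failover tooling

single_cloud = CLOUD

# App strung across both clouds: both clouds and the plumbing must all be up.
stretched = serial_availability(CLOUD, CLOUD, PLUMBING)

# Hot standby in a second cloud with a rarely tested failover (assumed to work
# half the time), and the plumbing still sitting on the request path.
hot_standby = serial_availability(
    failover_availability(CLOUD, CLOUD, 0.5), PLUMBING
)

if __name__ == "__main__":
    scenarios = [
        ("single cloud", single_cloud),
        ("stretched across two clouds", stretched),
        ("hot standby", hot_standby),
    ]
    for name, availability in scenarios:
        minutes = downtime_minutes_per_year(availability)
        print(f"{name}: {availability:.5f} ({minutes:.0f} min/year)")
```

Under these assumptions both multicloud setups come out less available than the single cloud, because the plumbing is the weakest link on the path. The result flips only if the plumbing and the failover are assumed to be nearly perfect, which is exactly the part Majors doubts most teams can deliver on their own.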

Disclosure: I work for MongoDB, but the views expressed herein are mine alone.