Couchbase automates management with 2.0 release of Operator for Kubernetes

The company expects the next release later this year to provide auto-scaling as a feature.


On April 28, Couchbase announced the 2.0 release of its Operator for Kubernetes, previously in beta. Operator patterns in Kubernetes are named for the human role they replace, or at least augment, with the potential to do everything from automating installs and backups to scaling on demand. I asked Anil Kumar, director of product management for Couchbase, to explain what that means for customers and how it differs from the company's new public cloud offering.

Couchbase itself is both an open-source database—a key-value store that points back to documents—and a commercial company that supports the database. One of the differentiators of Couchbase is its container-first strategy, which Kumar explains is tied to Kubernetes, the de facto standard.


New capabilities

Human operators keep a system, well, operating. They run the scripts that perform backup and restore, and they install patch upgrades while coordinating with customers about downtime. Cluster-based applications are more complex; they might require deploying a half-dozen resources to get a system running. An intelligent human operator might check the resource needs of all of the components, and compare them to the space available in the cluster, before performing a deploy.
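The pre-deploy check described above can be sketched in a few lines of Python. This is only an illustration: the `Resources` type, the `fits` function, and all of the numbers are made up for this example, and a real Kubernetes scheduler checks capacity node by node rather than as one cluster-wide total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """CPU and memory for one component (hypothetical units)."""
    cpu_millicores: int
    memory_mib: int


def fits(requests: list[Resources], free: Resources) -> bool:
    """Return True if the summed requests fit within the free capacity."""
    total_cpu = sum(r.cpu_millicores for r in requests)
    total_mem = sum(r.memory_mib for r in requests)
    return total_cpu <= free.cpu_millicores and total_mem <= free.memory_mib


# Made-up numbers: three database pods plus one backup job.
requests = [Resources(2000, 4096)] * 3 + [Resources(500, 1024)]
free = Resources(cpu_millicores=8000, memory_mib=16384)

print(fits(requests, free))
```

An operator that runs a check like this before every deploy is doing what the careful human would do, only every time and without being asked.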

That is exactly the kind of intelligence that a level one "basic install" operator does not need, and that the team at Couchbase put into its new release, which achieves level four. In addition, the software automates backup and restore and provides Cross-Datacenter Replication, or XDCR, so you can run in Amazon and back up to Google. The new operator is also easy to configure to export metrics into Prometheus, which is emerging as a standard monitoring tool.
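For readers who have not seen it, Prometheus scrapes metrics as plain text, one sample per line. The sketch below renders a single gauge in that text format. The metric name, node labels, and values are invented for illustration and are not the names the Couchbase operator actually exports; label escaping is also left out to keep the example short.

```python
def render_metric(name: str, help_text: str, samples: dict[str, float]) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for node, value in sorted(samples.items()):
        # One sample per line: metric_name{label="value"} number
        lines.append(f'{name}{{node="{node}"}} {value}')
    return "\n".join(lines) + "\n"


# Hypothetical metric and values, for illustration only.
text = render_metric(
    "example_db_ops_per_second",
    "Operations per second per node.",
    {"node-a": 1250.0, "node-b": 980.5},
)
print(text)
```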

In practice, you describe how to set up a Couchbase cluster in a simple YAML file, stored in version control. That file includes where to perform backups and restores, along with the policy for upgrades. In many cases the automated operator can perform the upgrade in real time through a blue/green deploy. A human operator would need to write code to achieve the same result. That code would need to be tested, and, while it might work for one upgrade, it could fail on another. All of the automation code might need to be retested with each new upgrade. With operators, Couchbase takes on that responsibility and provides the support when things go wrong.
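The idea behind a declarative file like that is that the operator compares what you asked for with what is actually running and works out the difference. A minimal sketch, assuming the YAML has already been loaded into a dictionary: the field names (`version`, `size`, `backup_schedule`), the version numbers, and the `plan` function are hypothetical and do not reflect the Couchbase operator's real schema.

```python
def plan(desired: dict, observed: dict) -> list[str]:
    """List the actions needed to move the observed state to the desired state."""
    actions = []
    if observed.get("version") != desired["version"]:
        actions.append(f"upgrade {observed.get('version')} -> {desired['version']}")
    delta = desired["size"] - observed.get("size", 0)
    if delta > 0:
        actions.append(f"add {delta} node(s)")
    elif delta < 0:
        actions.append(f"remove {-delta} node(s)")
    if observed.get("backup_schedule") != desired["backup_schedule"]:
        actions.append(f"set backup schedule to {desired['backup_schedule']}")
    return actions


# What the file in version control asks for (made-up values).
desired = {"version": "1.1", "size": 3, "backup_schedule": "0 3 * * *"}
# What is currently running (made-up values).
observed = {"version": "1.0", "size": 2, "backup_schedule": "0 3 * * *"}

print(plan(desired, observed))
```

When the two states already match, the plan is empty and nothing happens, which is why the same file can be applied over and over safely.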

While a level three operator provides the "full lifecycle" of install, upgrade, and backup, level four provides insights into operations to help with fine-tuning. The Couchbase solution for that is to get the data into Prometheus. To reach level five, the operator would essentially become a fully hands-off system, including automatic scaling up and down, automatic schedule tuning, and exception handling.

When a node goes down, the operator does not "fire an alert," but instead kills the offending node, finds where the data was replicated or backed up and restores it, all without interrupting operations. That's a tall order, and open to a little bit of interpretation. Kumar expects the next release, in the late third quarter or perhaps the fourth, to provide auto-scaling as a feature.
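To make the failover idea concrete, here is a toy model in Python. It assumes each data partition lists the nodes that hold a copy, with the active copy first; when a node fails, it is dropped and the next surviving copy takes over. The partition and node names and the `fail_over` function are invented for this sketch, and it is not a description of Couchbase's actual failover algorithm.

```python
def fail_over(placement: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Drop the failed node from every partition; the first survivor becomes active."""
    result = {}
    for partition, nodes in placement.items():
        survivors = [n for n in nodes if n != failed]
        if not survivors:
            # No replica left: the only remaining option is a restore from backup.
            raise RuntimeError(f"partition {partition} has no surviving copy")
        result[partition] = survivors
    return result


# Made-up layout: three partitions, each with an active copy and one replica.
placement = {"p0": ["a", "b"], "p1": ["b", "c"], "p2": ["c", "a"]}

print(fail_over(placement, "b"))
```

In this layout, losing node "b" leaves every partition with at least one copy, so nothing has to be restored from backup; a real operator would then add a replacement node and rebuild the missing replicas.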

The real story may just be why this makes sense for Couchbase to develop.

The economics of operators

Couchbase certainly qualifies for the "Lachmann trifecta": it requires scale (clustering), replication, and load balancing. Any company that wants to implement it on a Kubernetes cluster will need to write programs, or at least shell scripts, to monitor performance, and will likely need dedicated humans to take corrective action.

This is where Couchbase the company starts to separate itself from Couchbase the open-source project. Because Couchbase the company manages more than 75 customers, representing thousands of clusters, it can spread the cost of the operator over all of those customers, with each customer in effect paying only a few percentage points of the project's cost.

While Couchbase the database is open-source, the operator is only bundled as part of the Enterprise Edition. That is a little different from its new, public cloud service. The cloud service is a true containerized Database as a Service (DBaaS) running on your cloud. That allows you to negotiate with Amazon, Google, Microsoft, or IBM directly, which prevents cloud-vendor lock-in. The tools that manage the service presumably are part of the operator, which the company packages and delivers to enterprise customers to manage themselves.

Customer feedback on the operator is largely positive. Brent Burnett, a systems architect at CenterEdge, estimates the operator has reduced IT administration overhead by 80%. That's a big number and deserves some exploring. Kumar explained this meant the operational work, the time spent configuring, tuning, and supporting Couchbase, went down by four-fifths. According to Kumar, this is less likely to lead to layoffs (cutting four people from a five-person Couchbase team) than to database administrators taking on new responsibilities and new projects.

Company officials are counting on that. After all, they use the Couchbase Operator to run their public cloud offering.


Image: Getty Images/iStockphoto

By Matthew Heusser

Matt Heusser launched his company, Excelon Development, in 2011, after spending a decade in programming, testing, and project management. A former member of the board of directors of the association for software testing, Matt is a recipient of the Mo...