How to scale Kubernetes: 3 factors

Kubernetes is becoming the container orchestrator of choice. It can even scale applications up with demand, if you know how to configure it.


After Docker provided small, lightweight containers that could run on your laptop, Kubernetes came next to provide a real production cluster. As it turns out, running a production cluster is complex work. For example, Kubernetes does not come "out of the box" with the tools to manage scaling.

In this article, I'll discuss three levels of scaling in Kubernetes. First, I'll make the application aware of its resources. Next, I'll configure Kubernetes to request more virtual machines from a cloud. Finally, I'll talk a bit about high availability and balancing between multiple clusters.


Application Scaling

While Kubernetes is capable of scaling an application up or down "elastically," that work needs to be configured. For example, Kubernetes can track the resources that each application needs, so that it does not pile too many workloads onto one virtual machine. You can define an application's CPU and memory needs as resource requests and limits; network bandwidth is not a built-in resource type and, where supported, is typically handled through the cluster's network plugin. To do that, you need to profile the application in production to figure out what those needs are, then express those needs in the configuration of the pod. Without that information, the scheduler will assume the resource needs are zero and could easily overload a virtual machine with pods.
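To make this concrete, here is a minimal sketch in Python of what expressing those needs in a pod's configuration can look like. It builds a Pod manifest as a plain dictionary with CPU and memory requests and limits. The pod name, image, and numbers are hypothetical placeholders; real values should come from profiling the application.

```python
import json


def pod_manifest(name, image, cpu_request, memory_request, cpu_limit, memory_limit):
    """Build a Pod manifest (as a dict) that declares resource requests and limits."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # What the scheduler reserves when placing the pod.
                        "requests": {"cpu": cpu_request, "memory": memory_request},
                        # The ceiling the container is allowed to reach.
                        "limits": {"cpu": cpu_limit, "memory": memory_limit},
                    },
                }
            ]
        },
    }


# Hypothetical values: a quarter of a CPU core and 256 MiB requested,
# with limits set at twice the request.
manifest = pod_manifest("web", "example.com/web:1.0", "250m", "256Mi", "500m", "512Mi")
print(json.dumps(manifest, indent=2))
```

The printed JSON has the same shape as the YAML manifests most teams keep in version control, so the sketch is mainly useful for seeing where the requests and limits sit in the pod configuration.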

Assuming the application is designed to scale, you may want to run multiple pods for high availability. That way you can perform rolling upgrades and keep uptime close to 100%. The Kubernetes documentation has a tutorial on scaling a deployment, but that is a manual scale-up and scale-down of resources. To automate it, the Horizontal Pod Autoscaler can monitor CPU, memory, and other metrics, and add or remove pods as needed.
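The core calculation behind the Horizontal Pod Autoscaler is described in the Kubernetes documentation as the current replica count multiplied by the ratio of the observed metric to the target metric, rounded up. The Python sketch below is a simplified illustration of that rule, not the real controller: it ignores stabilization windows, pod readiness, and multiple metrics. The function name, the replica bounds, and the 10% tolerance default are assumptions chosen for the example.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified sketch of the HPA scaling rule.

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within the tolerance band, then clamped
    to the [min_replicas, max_replicas] range.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Four pods averaging 90% CPU against a 60% target: scale up.
print(desired_replicas(4, 90, 60))
# Four pods averaging 30% CPU against a 60% target: scale down.
print(desired_replicas(4, 30, 60))
```

The tolerance band is what keeps the pod count from flapping when the metric hovers near the target.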

The cluster itself consumes resources. If it is running in a pay-per-CPU-minute cloud, you want to keep that consumption as small as possible. This raises a new question: how do you get the cluster to scale with demand?

Cluster Scaling

Out of the box, Kubernetes does not, and cannot, provide tools to scale itself. The cluster is a cluster and is not aware of resources outside of itself. It is possible, however, to write a middleware tool that monitors utilization and is also connected to some other service that can provide virtual machines. This could be a public cloud, a private cloud such as OpenStack, or a virtual machine farm using a tool like VMware. Microsoft, Amazon, IBM, and Google all provide this autoscaling technology in their clouds for Kubernetes users, and there are open source autoscaler tools.
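The decision at the heart of such a middleware tool can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how any particular autoscaler works: production autoscalers generally simulate scheduling, while this version only adds up the requests of pods that could not be placed and divides by the size of one new virtual machine, which gives a lower bound. All names and numbers here are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeShape:
    """Capacity of one new virtual machine (hypothetical sizes)."""
    cpu_millicores: int
    memory_mib: int


def nodes_to_add(pending_pods, node_shape, max_new_nodes):
    """Estimate how many nodes to request for pods the scheduler could not place.

    pending_pods is a list of (cpu_millicores, memory_mib) requests.
    The result is a lower bound: it ignores bin-packing fragmentation.
    """
    if not pending_pods:
        return 0
    total_cpu = sum(cpu for cpu, _ in pending_pods)
    total_memory = sum(memory for _, memory in pending_pods)
    by_cpu = math.ceil(total_cpu / node_shape.cpu_millicores)
    by_memory = math.ceil(total_memory / node_shape.memory_mib)
    # Whichever resource is scarcer decides; the cap protects the budget.
    return min(max_new_nodes, max(by_cpu, by_memory))


pending = [(500, 1024), (1500, 2048), (1000, 512), (1500, 1024)]
shape = NodeShape(cpu_millicores=2000, memory_mib=4096)
print(nodes_to_add(pending, shape, max_new_nodes=5))
```

The cap on new nodes matters as much as the estimate: it is the guard that keeps a runaway workload from running up the cloud bill.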

Another option for cluster scaling is OpenShift, Red Hat's container platform, which runs Kubernetes clusters. OpenShift can have resources assigned to it and can manage scaling those resources up and down. It can also manage a hybrid-cloud environment, where some resources are on premises and others are in a public cloud.

The idea of a cluster asking for more resources is one thing. What if you want to run multiple clusters?

Multiple Clusters

The problem starts as simply as keeping development, test, and production clusters as separate entities. Without that separation (or specialized throttling limits), a performance or load test of an elastic cluster could take out the production cluster. Add to that the need for high availability, which might mean multiple clusters in different zones of a cloud provider. Then you have different operating units running multiple clusters in different countries, plus the ability to have customers routed to a data center on their continent.

Jason McGee, the CTO of the cloud platform at IBM, explains that this is a multiplication problem. Do the math, and a multinational might have dozens of different Kubernetes clusters. That makes getting a holistic picture of what is really going on quite a challenge, not to mention actually managing the resources and costs.
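To see how quickly the multiplication adds up, here is a small Python sketch. The counts are hypothetical, chosen only to illustrate the arithmetic: three environments, three availability zones, two operating units, and three continents.

```python
import math

# Hypothetical counts for each dimension along which clusters multiply.
dimensions = {
    "environments": 3,     # development, test, production
    "zones": 3,            # availability zones per region
    "operating_units": 2,  # separate business units
    "continents": 3,       # regions serving local customers
}

cluster_count = math.prod(dimensions.values())
print(cluster_count)
```

Adding one more operating unit to this example raises the total from 54 clusters to 81, which is why the count climbs into the dozens so easily.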

To manage its own cloud of more than 22,000 clusters, IBM built Razee, a tool it later offered as open source.


Other Options

At this point in history, taking on the challenge of scaling Kubernetes is basically volunteering to become a full-on digital company. This worked for Amazon, which turned Web services into a $7.4 billion business. If your company is not Amazon, you might consider the words of Homer Simpson: "Can't somebody else do it?"

As another option, you might consider the purpose of your cluster. Instead of going broad, you might want to do something very specific, such as big data mining in Hadoop, or running a NoSQL database like Redis. Specialized providers, like Redis Labs, are beginning to create managed service offerings, designed to handle one application and handle it well. Alex Miłowski, a product evangelist at Redis, explains how the company understands these scaling problems and created an operator tool to manage Kubernetes clusters running the Redis NoSQL database. The service offering can work in a local cluster (on premises), in the cloud, as a managed service in the cloud, or even to manage Redis running locally on bare metal servers.

Over the next 24 months, I expect to see growth in specialized cloud services, followed by either those services being acquired by the major cloud players or the cloud players creating their own competing offerings.

So keep your eye out, and don't blink. Things are changing fast, and you don't want to miss it.

