End of the fail whale: How Twitter aims to help mere mortals scale like a web giant

Twitter knows scale and now wants to help you do the same. Matt Asay explains.

Many enterprises dream of running at Twitter or Google scale, but much of the best technology, while open source, is so complex as to be unapproachable.

That's about to change.

As announced on Tuesday, Twitter and a gaggle of like-minded enterprises have formed the Cloud Native Computing Foundation under the aegis of The Linux Foundation. The purpose? To make it much easier (and less chaotic) to develop internet-scale applications.

To better understand why Twitter would bother to contribute to projects and foundations that might simply benefit competitors, I sat down with Chris Aniszczyk, head of Twitter's open-source office (and sometime snowboarding companion).

TechRepublic: Tell me about how Twitter works with the open-source community and where some of the projects you back have become strategic to your business.

Aniszczyk: At Twitter, we have an open-source office dedicated to ensuring that we have a healthy reciprocal relationship with open-source projects that our business depends on. Moreover, we have a team of developer advocates dedicated to growing open-source communities that are important to us.

As I've said before, Twitter's open-source office helps us avoid the expensive consequences of maintaining a long-term internal code fork by pushing our changes to the upstream community.

In terms of projects that we back that are strategic to our business, we tend to focus on infrastructure projects that help us scale our service. A good example of this would be Apache Mesos, which we helped shepherd from an academic project at AMPLab into a mature open-source ecosystem at the Apache Foundation.

TechRepublic: That makes sense as to how you'd streamline your open-source involvement, but why bother joining the Cloud Native Computing Foundation (CNCF) that Google spearheaded?

Aniszczyk: We strongly believe in the premise of cloud native computing and have been running our infrastructure in that fashion for years with Mesos. Kubernetes is growing and is great for writing simpler services, but it still has issues to overcome in running stateful and stateless services, resource sharing, security, and scaling to web-scale clusters.

By joining the CNCF, we hope to bring Kubernetes and Mesos closer together, as Mesos has already been established as a first-class framework for working with Kubernetes, and we expect this work to accelerate under the foundation to bring the two ecosystems together more formally. The first point of order will be to have the Technical Oversight Committee review the technology stack within the CNCF and put together a formal plan for convergence.

Fragmentation is happening with Mesos, Kubernetes, CoreOS, Docker, and more. We are looking at the foundation to simplify developers' lives through standardization so that we can continue to innovate instead of making all of our lives harder. At the end of the day, we really want to make cloud native technology ubiquitous and easily available for everyone, from small companies to larger enterprises.

TechRepublic: Twitter and Google have in common the blessings of armies of technical engineers. Google seems to have exposed some of their secret sauce for running containerized workloads at scale with Kubernetes. What do the rest of us need to achieve Google and Twitter operational efficiencies at scale?

Aniszczyk: While these technologies are open source and readily available, they aren't the easiest to use for mere mortals, as distributed systems are hard. My hope is that the CNCF will make this easier in the long run—but in the short term, there are startup companies forming around these technologies.

For example, Mesosphere is making it easier to consume Mesos ecosystem technologies via their DCOS technology. I expect to see this pattern continue, where businesses emerge to make it easier to consume technologies in this space.

TechRepublic: At MesosCon last year, Twitter's Bill Farner said that the company has an amazing ratio of servers to SREs (sysadmins). How are you able to manage such a large-scale infrastructure using Mesos?

Aniszczyk: There's a lot involved, but the basics are to expect failure from the beginning and have your underlying infrastructure exposed as a set of compute resources to run tasks (applications) against. With a flexible API via Mesos, instead of a static set of machines you manage with brittle devops technologies, you can rely on fancy schedulers (frameworks) like Apache Aurora and Marathon to handle elasticity and failures.

Elasticity allows for fault tolerance, so if a service (or even a machine) within a Mesos cluster fails, the schedulers can move the applications and services that were utilizing this resource to somewhere else within the cluster.
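To make the idea concrete, here is a minimal sketch of a scheduler that treats machines as a pool of compute resources and moves tasks off a failed machine. It is a toy illustration only: the `Machine` and `ToyScheduler` names, the CPU-only resource model, and the "most free CPUs wins" placement rule are all assumptions made for this example, not the Mesos, Aurora, or Marathon APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A hypothetical machine that offers CPUs to the scheduler."""
    name: str
    cpus: float
    healthy: bool = True
    tasks: dict = field(default_factory=dict)  # task name -> CPUs reserved

    def free_cpus(self) -> float:
        return self.cpus - sum(self.tasks.values())


class ToyScheduler:
    """Places tasks on any healthy machine with enough spare CPU."""

    def __init__(self, machines):
        self.machines = {m.name: m for m in machines}

    def place(self, task: str, cpus: float) -> str:
        candidates = [
            m for m in self.machines.values()
            if m.healthy and m.free_cpus() >= cpus
        ]
        if not candidates:
            raise RuntimeError(f"no capacity left for task {task!r}")
        # Assumed placement rule: pick the machine with the most free CPUs.
        target = max(candidates, key=lambda m: m.free_cpus())
        target.tasks[task] = cpus
        return target.name

    def handle_failure(self, machine_name: str) -> dict:
        """Mark a machine as failed and re-place everything it was running."""
        failed = self.machines[machine_name]
        failed.healthy = False
        displaced, failed.tasks = failed.tasks, {}
        return {task: self.place(task, cpus) for task, cpus in displaced.items()}


if __name__ == "__main__":
    scheduler = ToyScheduler([Machine("a", 4), Machine("b", 4), Machine("c", 2)])
    for task, cpus in [("web", 2), ("cache", 2), ("api", 1)]:
        print(task, "->", scheduler.place(task, cpus))
    print("after losing machine a:", scheduler.handle_failure("a"))
```

The point of the sketch is that the operator never names a machine when launching a task. Because tasks are described only by the resources they need, the scheduler is free to move them when hardware disappears, which is the property Aniszczyk describes.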

About Matt Asay

Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.
