Docker is hot. Really, really hot. But whether it’s hot for you… well, that’s a different story.
Just as Hadoop has seen far more adoption among Silicon Valley web giants and startups than mainstream enterprises (though this is changing), so too may Docker be a perfect fit for avant-garde upstarts while not a great solution for legacy apps. As Microsoft’s Jason Perlow tells it, Docker “containers are great but it’s going to be an issue for legacy apps…. It’s for greenfield” deployments.
While Perlow is likely right (your legacy application will die before it sees the light of Docker), it’s also true that Docker’s benefits extend well beyond greenfield applications at startups.
Why you can’t ignore Docker
Even among other hot technologies, Docker stands out. This is particularly true among the startup set, which can be viewed as a leading indicator of where technology is heading. Linux, Hadoop, and other technologies tend to get started with Silicon Valley startups and then trickle down to the more risk-averse mainstream enterprises.
Docker is a darling among the startup-oriented Hacker News crowd. According to Ryan Williams’ excellent Hacker News Hiring Trends report, Docker is one of the fastest-rising technologies on the whoishiring discussion thread, jumping 13 places to 42nd place in December, from 55th overall in September.
While Docker still trails other popular technologies like AngularJS, Hadoop, and MongoDB, its growth is impressive.
Nor is it limited to startups. A quick glance at Google search data suggests that interest in Docker is deep, widespread, and about to surpass Hadoop.
As I’ve written before, there are a number of reasons for Docker’s seemingly sudden rise. Container technology has been around for decades, but it was the spread of Linux and virtualization that paved its way. Plus it doesn’t hurt that Docker makes life easy for developers, as Microsoft’s Patrick Chanezon posits.
But paved the way to what? Why, exactly, does it matter?
There are three primary reasons:
- Docker provides a better way to package and distribute software.
- Docker offers a standard API and lifecycle model for applications.
- Docker delivers lightweight resource isolation.
I’ll tackle each of these in turn.
Making software packaging and distribution simple
There are many parts to a modern software system. There are binaries, libraries, configuration files, dependencies, and more. While assembly of these diverse components can be complicated on a single machine, things get much more complex when you “ship” that software.
Once you ship your software, you need to figure out a way to package all these things together and put them where they need to run. But this is hard because most packaging systems don’t do a good job of wrapping all these things together.
I discussed Docker with MongoDB’s Jared Rosoff, who reminded me of the bad ol’ days of “DLL Hell,” when developers were pulling their hair out over Windows deployments. Those stuck in DLL Hell were embroiled in the problem of installing a piece of software that depended on a library, while another piece of software depended on a different version of that same library.
Docker, he explains, makes it easy to avoid this DLL Hell because all the dependencies of an application are wrapped up in a single package.
Containers make it easy to package up our software along with all its dependencies and ship it to the developer across the room, to staging or production, or wherever it needs to run. It’s a way to ensure a “smoother transition from development to production,” as Trend Micro’s Mark Nunnikhoven highlights.
Docker achieves this through the equivalent of “static linking” in your compiler. Rather than dynamically loading all of an application’s dependencies when running a program (dynamic linking), the dependencies are baked into the executable itself (static linking). This makes it easier to ship software around and run it wherever a developer chooses.
This approach, as Neo4j’s Kenny Bastani puts it, means that with Docker, “Containerizing applications becomes as easy as creating a recipe. Deployments are extensible, lightweight, and easier to manage.”
These are benefits any developer needs, whether working for a startup or a Fortune 500 enterprise.
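To make the DLL Hell contrast concrete, here is a toy sketch in Python. It is not how Docker works internally. It only models the difference between one shared library directory and per-application bundles, and the app and library names are made up for illustration.

```python
# Toy model: each app declares the library versions it needs.
# The app and library names below are hypothetical.
APPS = {
    "billing": {"libssl": "1.0"},
    "reports": {"libssl": "1.1"},
}


def shared_install(apps):
    """Install every dependency into one system-wide library directory.

    Only one version of each library can live there, so a second app
    asking for a different version is recorded as a conflict
    (the "DLL Hell" case).
    """
    system_libs = {}
    conflicts = []
    for app, deps in apps.items():
        for lib, version in deps.items():
            installed = system_libs.setdefault(lib, version)
            if installed != version:
                conflicts.append((app, lib, version, installed))
    return system_libs, conflicts


def bundled_install(apps):
    """Give every app its own private copy of its dependencies.

    This mirrors the packaging idea behind a container image: the
    dependencies travel inside the package, so versions never collide
    across apps.
    """
    return {app: dict(deps) for app, deps in apps.items()}


if __name__ == "__main__":
    _, conflicts = shared_install(APPS)
    print("shared conflicts:", conflicts)
    print("bundled images:", bundled_install(APPS))
```

In the shared model the second app loses; in the bundled model each app simply carries the version it asked for, at the cost of storing the library twice.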
But wait! It gets better.
Providing a standard API and lifecycle
Not only does Docker make it easy to package and distribute applications, but it also goes a step further by providing an API to install, start, and stop containers.
For example, running a Linux process traditionally requires typing a command into a shell. This works great if you’re sitting at a terminal and manually starting something up. But when you’re trying to automate things, you end up basically writing a screen-scraper that drives the shell for you in order to start, monitor, and stop processes. It’s onerous (read: horrifically painful) to write these scripts, and even worse, they’re error prone.
Docker, and containers generally, wrap an API around this work, which dramatically simplifies writing software to control the processes running in your data center.
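As a rough illustration of what that API looks like from a program’s point of view, the sketch below builds the HTTP requests for a create, start, stop cycle. The endpoint paths and field names follow the Docker Engine REST API as I understand it, so check them against the documentation for your Docker version. The helper function names are my own, and nothing is actually sent, since that would require a running daemon.

```python
import json
from urllib.parse import quote, urlencode


def create_request(image, name, cmd=None):
    """Build the request asking the daemon to create a named container."""
    body = {"Image": image}
    if cmd:
        body["Cmd"] = list(cmd)
    path = "/containers/create?" + urlencode({"name": name})
    return ("POST", path, json.dumps(body))


def start_request(container):
    """Build the request that starts an existing container."""
    return ("POST", f"/containers/{quote(container, safe='')}/start", None)


def stop_request(container, timeout_s=10):
    """Build the request that stops a container, allowing a grace period."""
    query = urlencode({"t": timeout_s})
    return ("POST", f"/containers/{quote(container, safe='')}/stop?{query}", None)


def lifecycle(image, name, cmd=None):
    """Return the three requests for a create -> start -> stop cycle."""
    return [
        create_request(image, name, cmd),
        start_request(name),
        stop_request(name),
    ]


if __name__ == "__main__":
    # "redis" and "cache" are placeholder image and container names.
    for method, path, body in lifecycle("redis", "cache"):
        print(method, path, body or "")
```

Compare three structured requests with a script that launches a process over SSH and then parses its output to learn whether it is still alive.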
Lightweight resource isolation
At Docker’s core are Linux’s cgroups (control groups), which offer ways to account for and limit the amount of CPU, memory, network, and disk resources a container uses. This provides some of the benefits of virtualization, namely the ability to carve up a computer into smaller chunks of resources so that no one process takes over the whole machine and starves the others, without the heavy overhead (or cost) of VMware.
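One way to picture the CPU side of this is the arithmetic behind relative CPU weights. The sketch below assumes the cgroup v1 "cpu shares" model, in which busy groups split CPU time in proportion to their weights and the weights only matter under contention. The container names and share values are hypothetical, and memory, network, and disk limits work differently.

```python
def cpu_fractions(shares, busy=None):
    """Split CPU time among busy containers in proportion to their shares.

    `shares` maps a container name to its relative weight. `busy` lists
    the containers currently competing for CPU; by default all of them.
    Idle containers get 0.0, and their weight is not counted.
    """
    busy = set(shares) if busy is None else set(busy)
    total = sum(shares[name] for name in busy)
    if total == 0:
        return {name: 0.0 for name in shares}
    return {
        name: (shares[name] / total if name in busy else 0.0)
        for name in shares
    }


if __name__ == "__main__":
    # Hypothetical weights for three containers on one host.
    weights = {"web": 1024, "batch": 512, "cron": 512}
    print(cpu_fractions(weights))                # everyone is busy
    print(cpu_fractions(weights, busy=["web"]))  # only "web" is busy
```

The second call shows why this is lighter than carving out fixed VM sizes: a container that is alone on the machine can use all of it, and the weights only take effect when neighbors compete.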
Will Docker work for you?
All of this sounds great, right? But that doesn’t mean Docker will fit all enterprise needs.
John Deere web developer Matthew Nuzum, for example, tells me that Docker’s containers are “cool for dev[elopers] and people who manage a lot of apps and want isolation. However, if you have big apps that use multiple servers,” this isolationist approach is less of a draw.
Indeed, MongoDB’s Vijay Vijayasankar says, “Docker democratized [Linux Containers] for [the] average Joe developer,” but also points out, “It is the ‘me me me’ approach to app dev. Totally self contained.”
All of which means enterprises need to be realistic about what problems they’re truly going to solve with Docker. As Rosoff said, “Docker probably isn’t the platform that people make it out to be any more than your compiler’s linker is a ‘platform.'”
So should you consider using Docker or containers? Probably.
But you should also think about Docker in roughly the same way that you would think about using binaries and processes on an operating system. After all, Docker is simply a way of packaging and running more complicated processes and keeping them separated from the other processes running on a computer.
What’s probably more interesting than Docker is the other software that Docker makes easier to write. If you’ve ever tried to build a distributed application management system, you will immediately appreciate a nice REST API to install, start, and stop processes, rather than the old way of cobbling together SSH, FTP, Bash, etc.
Containers are also useful in some limited multi-tenancy applications.
If you want to run multiple jobs on a single server, the traditional approach would be to carve it up into virtual machines and use each VM to run one job. But VMs are slow to start given that they must boot an entire operating system, which can take minutes. They’re also resource intensive, as each VM has to run a full OS instance.
Containers offer some of this same behavior but are much faster because starting a container is like starting a process. Docker containers also require much less overhead: They’re really no more expensive than a process.
Docker, however, is not a replacement for virtualization because:
- All your containers have to share the same underlying OS, so you can’t run, say, a Windows app and a Linux app on the same server.
- Containers provide much weaker security isolation than virtual machines, so they may not be appropriate for certain types of multi-tenancy.
The downsides of Docker
Given Docker’s promise, it seems almost churlish to pick at its foibles, but weak security and immature tooling are two of the biggest complaints I hear. Some, like Bastani, complain about Docker’s “scalability and cluster management,” noting that, “If I want to deploy a 10-node Apache Spark cluster using Docker, [it’s] not easy.”
Some of these things come down to Docker’s youth. The project was released in 2013, which means there are big unknowns about its operational stability, as well as about its direction.
In fact, Docker is young enough that it’s moving at a blistering pace. As Rosoff pointed out to me, Docker’s website has documentation for each of the last 15 (FIFTEEN) releases of the docker API since May 2013. For enterprises that want slow-and-steady release cycles, Docker is nowhere near ready.
Also, as popular as Docker is today, we may simply be infatuated with a cool new thing. Already competitors are swirling. CoreOS, for example, just released a competing container runtime, Rocket, and Ubuntu has its LXD container project. It’s very possible that a year or two from now the container landscape will have dramatically changed.
One reason for the rise of competitors is a desire by some to escape Docker’s closed ecosystem. Everything depends on the Docker registry, which means you must rely on Docker Inc.’s registry or run a copy of Docker in your own datacenter, which you need to pay for.
One of the notable things about CoreOS’ Rocket project is the ACI specification, which basically means a container can live anywhere, thereby offering a lot more flexibility in how and where you store your container images.
Even bigger than Docker
As hot as Docker is, perhaps even more important is something Docker gives us: the API-ification of one of the most core functions of the operating system, namely the installation and lifecycle of software components. This is going to enable a whole new generation of management tools to grow up around containers.
That’s a big deal.
It’s actually pretty surprising that the OS is only just now getting an API to do this, especially given the focus the industry has had on APIs and software control of everything. Sure, technically we had low-level API calls like fork and exec, but we’re now several levels up the stack with containers and able to do much more “stuff” with fewer API calls.
This container-led API-ification is making its mark on tools like PaaS platforms, which let me take a piece of code, push it out to hundreds of servers, and scale up and down resources to the app as needed. It would have been extremely difficult to do this had we not had a mechanism to package the software, distribute it, and start and stop it as needed.
Going forward, we’re going to need a lot more to be API-ified, though. In addition to containers, we need programmable networks and programmable storage. These exist, but they’re not yet integrated.
Applications are more than a single container. Interesting applications comprise many containers that must be orchestrated to make a whole app. There are a lot of approaches being pushed today with no clear winner. Compounding this problem is the fact that stateful services like databases and message queues are underserved by containers. This will need to change if we ever want to truly deploy whole apps with a Docker-like approach.
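To give a feel for the smallest piece of that orchestration problem, the sketch below works out a safe start order for a multi-container app from its dependencies. The container names and the dependency map are hypothetical, it relies on Python’s standard graphlib module (Python 3.9 or later), and real orchestrators also have to handle health checks, placement, and failure, none of which appears here.

```python
from graphlib import TopologicalSorter

# Hypothetical app: each container maps to the containers it depends on.
APP = {
    "web": {"api"},
    "api": {"db", "queue"},
    "db": set(),
    "queue": set(),
}


def start_order(dependencies):
    """Return an order in which every container starts after its dependencies.

    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(start_order(APP))
```

Note that the stateful pieces ("db" and "queue" in this made-up app) sit at the bottom of the graph, so everything else waits on exactly the services that containers serve least well today.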
All of which is simply to suggest that Docker, specifically, and containers, generally, still have a long way to go. But given how much promise they show — and the developer reception to them — the answer to whether your company should evaluate Docker is a firm “probably,” with caveats.