A new startup launched by Facebook operations engineers hopes to take debugging into the 21st Century. Here's how.
What happens when software eats the world, but is so riddled with bugs that the resultant world becomes a nightmare of failing services? HackerOne CEO Marten Mickos has one answer: Bug bounty programs, designed to provide financial incentives for talented hackers to discover and destroy bugs.
Former Facebook engineering lead and Honeycomb co-founder Charity Majors (@mipsytipsy), however, has a different answer: Debug live, distributed services and squash those bugs before they squash you. While admirable in tackling yesterday's legacy systems, it's almost Quixotic when one considers the rampant system complexity of today's applications.
Officially launched on Tuesday, Honeycomb.io bills itself as "the fastest, simplest way to debug with data" and, given the founders' pedigrees, it just might have a valid claim to that tagline. Despite the presence of roughly 1.2 billion other debugging options available to developers, developer darlings like Docker and GitHub have already jumped on the Honeycomb bandwagon, suggesting that it may provide a missing link in software development.
In an interview with Majors, she tackled the alleged futility of standing out in a crowded debugging market, compared Honeycomb's approach to the NoSQL concept of flexible schema, and didn't swear even once (which shows tremendous restraint, if you know Majors).
This shouldn't work, but it does
When Majors first reached out, one question loomed for me: How does Honeycomb hope to stand out, both product-wise and, frankly, marketing-wise, given a plethora of debugging alternatives (GDB, Microsoft Visual Studio Debugger, etc.)? Majors' response was classic: "Everybody told us we would struggle to be noticed or stand out, but we...haven't. We have gotten remarkable traction—buzz, signups, even paying customers—just by word of mouth and my Twitter feed until now."
So...why? Why would yet another debugger make a difference to anyone? It turns out that solving an unsolved problem actually matters. It also turns out that no one has come close to solving the debugging problem for the extraordinarily difficult problems that "emerge from the interaction of software, services, systems, and user behavior which require exploratory analysis," otherwise known as debugging, Majors said:
This is still an unsolved problem. People are hungry for it. Even if they don't know that what they're dying to get is 'a high cardinality, high dimensionality, interactive real time tool for slicing and dicing production data in real time,' everybody feels the pain. Everybody is still jumping around between metrics and APM and log aggregators to answer a single question. Everybody is still failing over to hand munging the raw logs from S3 as a last resort, and building homegrown patches to bridge between components. Everybody can tell that the existing tools are baling wire and duct tape and a hammer.
The easy problems have been "automated out of existence," Majors said, leaving only the hard problems. These hard problems aren't the known-unknowns, the sort of thing you can set up a cascading array of dashboards to track. They are the unknown-unknowns, the sort of brutally difficult problems that require asking questions of one's data without having to "predefine the schemas or the indexes, which makes it so you have had to try and predict what data you will need in order to answer a question you couldn't predict asking."
Back to the future of event-driven debugging
This concept of flexible schema is reminiscent of NoSQL, and it's one of the things that makes Honeycomb so potentially revolutionary. As Majors continued:
Modern infrastructures and architectures are fluid, dynamic, ephemeral. Anyone who is pricing or structured in a way that assumes the existence of a 'host' is fundamentally operating under the old way of doing things. You don't know what data you're going to need, and you don't know what you need to be able to efficiently search on—the answer to both is 'everything.'
The approach invokes "a BI kind of mindset to building and operating software: Ask questions of your systems, services, and code." This is critical because, as Majors told me, "There's only so much you can even predict or guess or write tests for. Users add chaos and are inherently unpredictable. [You have to] get comfy with that."
For those born before Duran Duran and Culture Club, this feels like a return to 1980s-style event driven debugging. Because, well, it is. As Majors explained, "Every mature company is going to have a metrics store and an event store. But, when given a choice, developers always, always choose the event store for their debugging. Events just need to grow up and get past the log aggregator stage—unstructured logs are just a hack for what people really want and need, which is a structured event store."
For those developers working on legacy code in legacy corporations, have no fear. Honeycomb has made it easy to ingest data, whatever the source: "We accept JSON blobs and a write key. That's it. You can stream literally any data source into Honeycomb," Majors said.
You don't have to eat your pizza alone
Software development is an inherently social, collaborative activity. Debugging should be, too. At least, that's one final piece of the Honeycomb puzzle.
"The art of being a great debugger consists of finding and leaving breadcrumbs, integrating hints and clues, accessing archives of past outages and events, following the changelog trail," Majors said. "The mechanics of debugging are social, even if it is literally only one person, you're still collaborating with your past self and past investigations, and leaving markers to guide your future self."
But let's be honest: Not every social debugger is good at it. So what does Honeycomb offer for the not-so-great debugger, that person who does it because they have to? "With Honeycomb, we can bring everyone up to the level of the best debugger, by capturing how their brain works and making it accessible to everyone," Majors said. This "brain capture" might involve shared queries, playlists, activity histories, and more. The point is to understand the questions these skilled debuggers ask, and why.
As it stands, we may be entering the second great stage of DevOps. According to Majors, the first phase was about the operations folks learning to write real software. This second phase, she continued, is about software engineers learning to operate their services. Along the way, Honeycomb hopes to make software debugging useful and social. It just might work.
- Top 10 challenges to DevOps implementation (TechRepublic)
- Gap between DevOps-savvy and non-savvy companies is huge, survey finds (ZDNet)
- What is DevOps? 3 things you need to know (TechRepublic)
- How one company's DevOps success got them the green light to hire 1000 developers (TechRepublic)
- HackerOne CEO: The tech industry has some 'catching up to do' on software security (TechRepublic)