Open source dominates big data. So much so, in fact, that Cloudera co-founder Mike Olson has declared, "No dominant platform-level software infrastructure has emerged in the last ten years in closed-source, proprietary form." He's right, as the vast majority of our best big data infrastructure (Apache Hadoop, Apache Spark, MongoDB, etc.) is open source.
Given the importance of big data, and its outsized influence on the world, it would be easy to fall into the belief that big data dominates open source. That would be wrong, as a look at GitHub's top launches over the past year suggests. While GitHub launches are not a perfect measure of project popularity, the broad diversity of GitHub's most anticipated repositories indicates that open source has influence well beyond core data infrastructure.
Take, for example, mobile. The biggest release on GitHub, as measured by stars, is Apple's Swift programming language. Though not exclusively tied to mobile (Swift supports Apple's mobile-friendly iOS but also the desktop-oriented macOS), Swift was a welcome upgrade for iOS engineers who had been swimming through Objective-C for years.
It's not hard to understand why developers would jump on the Swift repository. As important as the personal computer has been, mobile devices are far more pervasive than the PC ever hoped to be. Enterprises are starting to understand Marc Andreessen's dictum that "software is eating the world," and mobile software increasingly governs how they engage with their customers, as Scott Hudler, chief digital officer of Dunkin' Donuts, insisted: "Mobile is the biggest change to our store since we cut a hole in the wall for the drive thru."
But what about the rest of the top 10 biggest launches on GitHub?
Yes, no. 3 is a big data project—TensorFlow, an open source machine learning library—but most of the other projects relate to building out the web:
- React Native (no. 4), a framework for building native apps with React
- Material Design Lite (no. 5), a library that lets developers add a Material Design look and feel to static content websites
- Clipboard.js (no. 8), a library that makes it easy to copy text to the clipboard
- create-react-app (no. 9), a tool for building React apps with no build configuration
Diversity by the few
Despite the relatively broad range of technologies on the list, they come from just a few companies: Google, Facebook, Apple, and Microsoft combined can claim eight of the top 10 biggest launches on GitHub. If we look at the most starred GitHub projects of all time, these same companies populate the top of the list (along with code-learning tutorial resources).
On the one hand, this is to be expected: Bigger companies can make more noise about their open source projects. It's also true that these companies have tended to release significant, industry-changing code, making anything they do worthy of mass developer attention.
Yet this concentration does narrow the idea of "open" source somewhat. The code is open, sure. But if we're being corralled into the pet projects of just a few mega-corporations, how "open" is that? This isn't to suggest nefarious designs by any of the companies mentioned. Far from it. We should be grateful for awesome code like React Native, TensorFlow, and Swift.
Even so, I'd be thankful for more code bubbling up from the masses like Julian Garnier's Anime or Zeno Rocha's Clipboard.js. That has long been the promise of open source, even if the promise has never quite matched the reality of open source being dominated by big companies with big self-interest.
Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.