Open Source

Why you should fork your next open-source project

Code forks, rarely used, turn out to be remarkably effective at driving innovation, so why don't we use them more? Matt Asay explains.

Forking isn't nearly common enough.

The right to "fork," or copy an open-source project and take the copy in a new direction, remains one of open source's cardinal virtues, but it's a right that is rarely exercised. Even as open-source projects have mushroomed to millions of repositories, research by Gregorio Robles and Jesús M. González-Barahona identifies a mere 220 forks.

While the number of forks has increased over the past few years, one thing hasn't changed: forking an open-source project often succeeds. So, why don't we see more forks?

The rise and plateau of the fork

Figure A
As open source has become a standard development procedure for engineers, its incidence has increased — but not linearly.

In 2008, researcher Dirk Riehle found that the population of open-source projects was growing exponentially.

Since then, open-source project creation has maintained that same heady pace, if not increased, due largely to growth in mobile open-source projects, as Black Duck details.

During that time, despite this exponential growth, Robles and González-Barahona uncovered a mere 220 distinct forks in their research ("A Comprehensive Study of Software Forks: Dates, Reasons and Outcomes").

Figure B
And while forks have become more frequent in recent years, their growth hasn't kept pace with the overall growth of open-source projects, as the chart indicates.

Interestingly, the forks aren't confined to any particular area of software. While networking software (e.g., servers and clients) accounts for 23.6% of all forks, the rest are spread fairly evenly across web applications (15.5%), development technologies like IDEs (13.2%), multimedia systems for audio/video (8.2%), games and entertainment (8.2%), and so on.

As for the most common reasons developers fork a project, these are also split among disagreements on technical direction (27.3%), discontinuation of the original project (20.0%), a desire to escape a company's control for more community-driven development (13.2%), and other factors.

Most interesting, however, is that we see so few forks despite the fact that they're often successful.

Forking is effective, so why is it rare?

While some (including me) have assumed that forking rarely works, the data suggests otherwise. In fact, forked projects displace the original projects twice as often as the original projects kill off the forks:

Figure C

So, why don't we see more forks?

In my experience, developers turn to forks as a last resort. To fork a project is also to divide its developer base, create disharmony, and potentially eviscerate all that made the project succeed in the first place. It's not a small matter to break up the family, as it were.

Yet sometimes, that's exactly what's required, and platforms like GitHub make forking trivially simple (even if the technology doesn't take care of all the human factors that complicate a fork).

Boosting innovation with increased forking

Given the ever-increasing importance of open source, we need to make forking less personal and even easier to implement. As Robles and González-Barahona posit:

"Technology should make forks easier; the convenience of forking should just be a strategic matter that allows to maintain balances among the stakeholders of a project. On the other hand, lowering the technological barrier to fork may increase the number of friendly experimental forks that boost innovation."

As the authors suggest, forking shouldn't be the acrimonious affair that it's traditionally been. If a developer (or group of developers) thinks they can improve on existing code, let's make it easy to do so. Atlassian points to some of the benefits for enterprise innovation, including the unshackling of individual expertise: "A non-sanctioned team or a lone developer with interest in the matter can fork the project and start contributing without requiring supervision and without disrupting the core team's work."

While groupthink is often good for keeping a project cohesive, it's just as often bad for innovation. By encouraging the use of software forks, projects and enterprises alike should see a significant uplift in innovation.

About Matt Asay

Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.
