Is open source software more or less prone to show-stopping bugs than proprietary alternatives?

The answer is likely ‘it depends’, but both arguments have been put forward forcefully online in the wake of the Heartbleed bug in the OpenSSL code library, which went undiscovered by the open source community for two years.

One firm that has been poring over open source and proprietary code looking for such defects for years is Coverity.

This year, for the first time, the company found the open source code it looks at had a lower density of defects than the proprietary code examined by its tools.

Coverity looks at the code base of more than 1,500 open source projects, with the largest being NetBSD, FreeBSD, LibreOffice and the Linux kernel, as well as scrutinising various Java projects such as Apache Hadoop, HBase, and Cassandra. Coverity doesn’t reveal which proprietary code scans are used as a comparison, but the firm’s customers include Microsoft, SAP and RSA.

The testing house found that as code bases grew in size, so did the number of defects per thousand lines of code, but that the growth was slower in C and C++ open source code bases than in their proprietary counterparts.

Eric Lippert builds C# analysers at Coverity, and believes there has been an increased focus on code quality in the open source projects it examines.

“One factor that’s pretty clear is the open source community has really embraced this idea of running static analysis tools, paying attention and fixing the bugs quickly. That really drives the number of defects down. The total number of defects fixed in 2013 was 50,000 in the open source projects. That’s just a huge number of defects that have now gone away. That may speak to why the density is now lower, because they removed the ones that were found in the past.

“It used to be the case that if we found a defect in Linux, months could pass between when the defect was reported and when it was fixed. Nowadays it is much more likely to be on the order of days or hours, which is great, because the longer a bug is in a product the more likely it is that somebody is going to make a dependency on that bug, or that somebody is going to read that code and misunderstand it.”

These defects are generally “resource leaks or null pointer references, because those are the type of things the code does all the time”, according to Lippert.
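As a rough illustration of the two defect classes Lippert mentions, here is a minimal sketch in Python. It is not taken from Coverity or from any project it scans, and every function name is hypothetical; Python's None stands in for a null pointer.

```python
from typing import Optional


def first_line_leaky(path: str) -> str:
    # Resource leak: the file handle is opened but never explicitly closed.
    handle = open(path)
    return handle.readline()


def first_line_safe(path: str) -> str:
    # The 'with' block closes the handle even if readline() raises.
    with open(path) as handle:
        return handle.readline()


def find_email(users: dict, name: str) -> Optional[str]:
    # Returns None when the user is unknown.
    return users.get(name)


def email_domain_unsafe(users: dict, name: str) -> str:
    # Null reference: raises AttributeError when find_email() returns None.
    return find_email(users, name).split("@")[1]


def email_domain_safe(users: dict, name: str) -> Optional[str]:
    # Check for None before using the value.
    email = find_email(users, name)
    if email is None:
        return None
    return email.split("@")[1]
```

Both unsafe versions behave correctly on the common path, which is part of why defects of this kind are easy to read straight past in review.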

As for why defect density increases with the size of the code base up to one million lines, Lippert said that’s quite simple to explain.

“If you’ve got 100,000 lines of code you’ve probably got a relatively small number of developers who’ve been working on the project for a relatively small amount of time. There simply hasn’t been as much time for them to introduce problems and they probably all communicate well with each other and have large areas of ownership.

“If you imagine a code base with one million lines of code, it’s reasonable to assume that code took a much longer time to write, that the team that wrote it was larger and there has been turnover in that team over time. The people who did understand the code have gone and the new people make mistakes and the people reviewing the code aren’t the experts anymore. Also, in large code bases pieces of code tend to become interdependent in ways that they don’t in smaller code bases. So it’s not a surprise to me at all that we see a considerably larger defect density in large projects than we do in smaller projects.”

Above one million lines of code, defect density declines, a change Coverity attributes to a heightened focus on quality controls in the largest projects.

There are limitations to how broadly the findings of the report can be extrapolated. The finding that the open source code bases have an overall defect density of 0.59 per 1,000 lines of code, compared to 0.72 for proprietary software, may be influenced by the average proprietary project analysed by Coverity having three times more lines of code than open source alternatives, given that defect density tends to rise with the size of the code base.
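For readers unfamiliar with the metric, defect density is simply the number of defects divided by the size of the code base, scaled to 1,000 lines. The sketch below shows the arithmetic; the defect counts and project sizes are invented purely to reproduce the two headline figures and are not Coverity's data.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per 1,000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects * 1000 / lines_of_code


# Hypothetical projects, chosen only so the results match the reported densities.
open_source = defect_density(defects=590, lines_of_code=1_000_000)
proprietary = defect_density(defects=2_160, lines_of_code=3_000_000)

print(f"open source: {open_source:.2f} defects per 1,000 lines")
print(f"proprietary: {proprietary:.2f} defects per 1,000 lines")
```

In this invented case the proprietary project is three times the size of the open source one, so some of the gap in density could reflect size rather than how the code was developed.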

There is also the nature of the code bases studied by Coverity: the rate at which defects are fixed in the Linux kernel and the other projects Coverity analyses may not be representative of broader open source trends.

What about Heartbleed?

The revelation earlier this year of the Heartbleed flaw in the OpenSSL library, which helps encrypt communications over the internet, led some to question the security of open source code.

While Coverity has found bugs are being fixed faster in open source projects, Lippert believes the fact that Heartbleed went undiscovered for two years demonstrates that simply having an open codebase that anyone can scrutinise isn’t enough to increase security.

“There’s a famous quote, that ‘given enough eyeballs, all bugs are shallow’ [coined by open source software advocate Eric Raymond and known as Linus’s Law]. This is a good indication that though many eyes might make all bugs shallow, they don’t make bugs shallow fast enough. Two years is far too long for a huge vulnerability to be present, not just in a piece of open source software but one specifically designed to solve a security problem. That class of software should be the most heavily vetted by experts. It’s very disappointing to me that it took so long for the defects to be found.

“The idea that lots of people are going to look at it is not enough, we need better tools, we need a better culture around code reviews, a better attitude towards what makes code quality in high security software.”

Coverity’s analysers also didn’t detect the Heartbleed flaw, though Lippert said the firm has since written an analyser “that does find that class of defect”.

Raymond argues that to question the validity of the “many eyes” aphorism on the basis of Heartbleed going unnoticed is to ignore similar oversights in proprietary code bases.

“The mistake being made here is a classic example of Frederic Bastiat’s ‘things seen versus things unseen’. Critics of Linus’s Law overweight the bug they can see and underweight the high probability that equivalently positioned closed-source security flaws they can’t see are actually far worse, just so far undiscovered.

“Sunlight remains the best disinfectant. Open source is no guarantee of perfect results, but every controlled comparison that has been tried has shown that closed source is generally worse.”

Lippert also cites instances of defects going unnoticed within proprietary code bases for years.

“We saw it recently when Microsoft announced there is a security defect that has been in Internet Explorer for years and has only just been discovered.

“This sort of thing happens all the time. My point about open source is just having a lot of eyes on a problem is not enough. We need better tools than just a lot of people looking at it, because smart people can look right past these kind of defects very easily.”