Last week, many Debian users got something of a shock when they realized that encryption keys for OpenSSH, OpenSSL, and OpenVPN have all been vulnerable to relatively easy compromise for a while. Previously, I discussed how you can detect and replace vulnerable SSH keys on Debian, and Vincent Danen explained another means to find and fix crypto key vulnerabilities that arose as a result of this snafu. So much for the technical matters -- read on for a quick overview of the rest of the story.
The trouble began when Debian package maintainers used valgrind to profile some security tools and "fixed" some "problems" it found without understanding what they were doing. Valgrind reports use of uninitialized memory, which is usually a good sign of a bad bug, but is not always a sign that there's anything wrong. In fact, the uninitialized memory use in this case is a deliberate part of how the OpenSSL code behind tools like OpenSSH refreshes its entropy pool.
By "fixing" the uninitialized memory issue, Debian package maintainers destroyed the ability of tools like OpenSSL to add any entropy to the pool used to generate encryption keys. They did not merely modify the code that offended valgrind to eliminate uninitialized memory use; they eliminated the ability of these tools to add any new data to the entropy pool at all. The end result is that these tools were restricted to a very small set of possible encryption keys, small enough that an attacker can generate every candidate key and try each one in turn. That makes a brute-force attack not just possible, but easy.
The actual problem occurred with a patch to the OpenSSL libraries, which are used by the OpenSSH and OpenVPN projects to generate random numbers for encryption key generation.
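To make the consequence of a starved entropy pool concrete, here is a minimal Python sketch. It is not OpenSSL's code: `toy_keygen`, `brute_force`, and `SEED_SPACE` are hypothetical names, Python's `random.Random` stands in for a real key generator, and the bound of 32,768 possible seeds is an assumption chosen for illustration (it matches the default maximum process ID on Linux), not a figure from the discussion above. The point is only that when a generator's seed can take just a few thousand values, an attacker can regenerate every possible key.

```python
import random
from typing import Optional

# Assumption for illustration: the only thing that varies between runs is a
# small integer, such as a process ID below the default Linux limit.
SEED_SPACE = 32768


def toy_keygen(seed: int, nbytes: int = 16) -> bytes:
    """Toy stand-in for a key generator: fully determined by its seed.

    random.Random is NOT a cryptographic generator; it is used here only
    because it is deterministic and ships with Python.
    """
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(nbytes))


def brute_force(target: bytes) -> Optional[int]:
    """Regenerate every possible key and return the seed that matches."""
    for seed in range(1, SEED_SPACE):
        if toy_keygen(seed, len(target)) == target:
            return seed
    return None


# The "victim" generates a key; the attacker sees only the key itself.
victim_key = toy_keygen(12345)
recovered = brute_force(victim_key)

# The recovered seed reproduces the victim's key exactly.
print(recovered is not None and toy_keygen(recovered) == victim_key)
```

The search needs at most 32,767 attempts no matter how long the key is, which is why key length offers no protection when the seed space is this small.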
What they did right
Valgrind is a great tool, and there's no reason to avoid using it. The fact that it flagged something that, in this case, is not actually a problem is not where the real issue arose. Tools like valgrind and Purify (which would also produce warnings about uninitialized memory) are extremely helpful, and any C developer should become familiar with one or both. Used correctly, they can prove invaluable in discovering and fixing security issues.
The Debian package maintainers and the people working with them also employed some due diligence in tracking down the source of the apparent problem to the best of their ability. They discussed possible solutions, and they even chose real solutions rather than just hiding what they saw as a real problem.
The OpenSSH, OpenSSL, and OpenVPN teams, for their parts, have done great jobs over the years of maintaining useful, high quality, very secure tools, and have not neglected the software in their charge when presented with evidence of bugs and vulnerabilities that needed fixing.
What they did wrong
The Debian package maintainers may have discussed their options for solving the problem in a well-reasoned, careful manner, with an eye toward fixing rather than hiding problems, but they missed the single most obvious -- and obviously correct -- option of all. They failed to consider getting in touch with the upstream developers for the security tools in question.
When you have a problem with a piece of software that is developed and maintained by another team that obviously knows a fair bit about that type of software, any bugs and issues you find should be brought to the attention of the upstream developers. This is doubly important when you yourself are not a subject matter expert, and that level of importance increases by an order of magnitude when the software in question is security software.
Developing non-security oriented software in a secure manner requires some knowledge of good development practices, common security failings and vulnerabilities, and effective testing techniques. That's knowledge that any developer should have in his head. Developing actual security software, however, is an exacting, demanding, and highly specialized activity. It requires a level of expertise that one cannot simply fall into. If you are not deeply involved in security software development yourself with a fair bit of experience behind you, you should never take it upon yourself to second-guess the decisions of the security software development experts without an expert or six of your own looking over your shoulder and checking your work.
This kind of mistake can come up under pretty much any circumstances, of course -- it's not particular to a Linux distribution project's package maintainers, or even to software that legitimately uses code often considered a sign of a bug, as with uninitialized memory used to refresh an entropy pool. This particular case is an even more egregious example of poor handling of a security software issue than usual, though, because it happened with an open source project.
One of the benefits of open source software is that anyone who discovers a problem, or something that may be a problem (but actually isn't, as in this case), can collaborate with upstream developers to solve the problem not only for themselves but for all users of that software. Doing so, in fact, removes some of the software maintenance weight from their shoulders and places it where it belongs: on the shoulders of the upstream developers. What the Debian package maintainers did was therefore the wrong answer whether uninitialized memory use was a real problem or not. If it had been a real problem, they would have been fixing it only locally, which could conflict with later upstream updates, prevent other downstream users from getting the same fix, and add to their own workload by taking on maintenance of the fix personally rather than handing it to the people whose job it is to maintain such things. Since it was not a real problem, they "fixed" something that should not have been "fixed" at all, and created a huge problem out of nothing.
Regardless of what upstream source is providing the software you use, or redistribute to others -- whether it's a closed source commercial developer or an open source project -- your first instinct after documenting a potential problem to the best of your ability should always be to contact the upstream software developer through whatever appropriate channels you have available to you. Only after you have done so should you even consider fixing it locally rather than getting a fix from upstream (perhaps after submitting a patch to the upstream developers). If it's security software, even then you shouldn't fix it yourself unless you are a security software development expert, or have one on hand to review your work and make sure you're not making any silly mistakes.
But wait -- there's more! This is not entirely a problem with the Debian package maintainers. The OpenSSL team may need to clean up its act just a little, too. Getting in touch with the right people at the OpenSSL project is a less than obvious process, apparently. OpenSSL core team member Ben Laurie said that the openssl-dev mailing list, despite its name, is not the place to discuss development of OpenSSL. The OpenSSL Support page identifies it as a mailing list for "Discussions on development of the OpenSSL library," however. Even worse, the "openssl-team" e-mail address he suggests for reporting issues like the valgrind warnings doesn't appear to have been noted in any OpenSSL documentation or Web pages.
The OpenSSL team needs to make their preferred means of receiving such reports more accessible. On the other hand, that doesn't excuse the Debian package maintainers from submitting a patch to the OpenSSL team rather than just patching the code locally in the Debian project and leaving it at that. To varying degrees, it seems everyone involved was culpable.
Another recent, if less disturbing, example of downstream software users mishandling apparent bugs is the celebrated 25-year-old BSD bug. The Samba team discovered a bug in BSD Unix code for handling the MS-DOS filesystem which results in a simultaneous file access conflict issue. This problem has, since 1983, affected every BSD Unix and every OS substantially derived from it, including Apple's Mac OS X.
The failure here is that the Samba team wrote a work-around into Samba code when they discovered the issue rather than alerting BSD developers to the problem so that it could be properly fixed. Once again, a lack of communication led to a suboptimal result. Once again, the problem is compounded by the fact that the offending parties never bothered to get in touch with upstream developers in an open source project. Luckily, this error in judgment didn't result in a widespread, potentially very damaging security vulnerability.
All I can offer as a lesson is a repeat of what I've already said:
- Your first instinct should always be to bring a bug to the attention of upstream developers.
- Even if, after talking to upstream developers, you feel the need to make local changes without forking the project to solve systemic maintenance problems in the upstream project, you should never try to fix a code problem you don't understand.
- Perhaps most importantly, never assume you understand a problem (or even that there *is* a problem) with security software unless you yourself are an experienced security software developer or have one near at hand to double-check your work.
- Finally -- and this should be the most obvious point of all -- take advantage of the open source development model whenever possible. Introducing downstream changes to software you're getting from some other source to fix a bug is exactly the wrong way to do it, in part because you're depriving the upstream developers of the benefit of your development efforts, and in part because you aren't taking advantage of the upstream developers' familiarity with the software, but mostly because it's the upstream developers' job to maintain the software after any fixes have been applied.
There's no room for Not Invented Here syndrome in open source software development. When you let NIH get in the way of doing the right thing, you're not doing open source development any longer.