Date Added: Jul 2011
Rigorous identification of vulnerabilities in program code is key to implementing and operating secure systems. Unfortunately, only some types of vulnerabilities can be detected automatically. While techniques from software testing can accelerate the search for security flaws, in the general case discovering vulnerabilities is a tedious process that requires significant expertise and time. In this paper, the authors propose a method for assisted discovery of vulnerabilities in source code. Their method embeds code in a vector space and uses machine learning to determine API usage patterns automatically. Starting from a known vulnerability, these patterns can be used to guide code auditing and to identify potentially vulnerable code with similar characteristics, a process they refer to as vulnerability extrapolation.
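The idea of extrapolating from a known vulnerability can be illustrated with a minimal sketch. The abstract does not give the feature set or the learning algorithm, so everything below is an assumption made for illustration: each function is represented as a bag of the API symbols it calls, weighted by TF-IDF, and plain cosine similarity stands in for the machine-learning step that determines usage patterns. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: function name -> API symbols it calls.
# In practice these would be extracted from parsed source code.
FUNCTIONS = {
    "parse_header": ["malloc", "memcpy", "strlen", "free"],
    "copy_packet": ["malloc", "memcpy", "ntohs", "free"],
    "read_chunk": ["malloc", "memcpy", "strlen"],
    "log_message": ["fopen", "fprintf", "fclose"],
    "render_page": ["fopen", "fread", "fclose", "printf"],
}


def embed(functions):
    """Map each function to a sparse TF-IDF vector over API symbols."""
    n = len(functions)
    # Document frequency: in how many functions does each symbol occur?
    df = Counter(sym for syms in functions.values() for sym in set(syms))
    vectors = {}
    for name, syms in functions.items():
        tf = Counter(syms)
        vectors[name] = {s: c * math.log(n / df[s]) for s, c in tf.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(s, 0.0) for s, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def extrapolate(functions, known_vulnerable, top_k=3):
    """Rank the other functions by similarity to a known-vulnerable one."""
    vectors = embed(functions)
    seed = vectors[known_vulnerable]
    scored = [
        (cosine(seed, vec), name)
        for name, vec in vectors.items()
        if name != known_vulnerable
    ]
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:top_k]]


if __name__ == "__main__":
    # Assume parse_header is the function with the known vulnerability.
    for name, score in extrapolate(FUNCTIONS, "parse_header"):
        print(f"{name}: {score:.3f}")
```

In this toy corpus, the functions that share memory-handling calls with `parse_header` rank above the file-handling functions, which share no symbols with it. An auditor would review candidates in that order. A fuller treatment would replace the raw cosine ranking with a learned low-dimensional representation of usage patterns, which this sketch deliberately omits.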