Antivirus developers need to run malcode in their labs in order to create malware-identifying signatures. What happens if they can't?
Traditional antivirus vendors depend on two things:
- The ability to run malware in their labs.
- The ability to automatically analyze malware.
What if they couldn't do either?
Why the dependence?
The traditional antivirus client constantly refers to a database containing signatures of identified malware. Creating an entry for the signature database requires analyzing a copy of the malware. If that's not possible, malware is free to do its dirty work and the antivirus client is none the wiser.
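To make the dependence concrete, here is a minimal sketch of a signature lookup. It assumes a signature is simply the SHA-256 hash of the whole file; real antivirus signatures are more elaborate than that. The names `signature`, `signature_db`, and `is_known_malware` are made up for illustration.

```python
import hashlib


def signature(sample: bytes) -> str:
    """Stand-in signature: the SHA-256 digest of the whole sample."""
    return hashlib.sha256(sample).hexdigest()


# Hypothetical database built only from samples the lab was able to analyze.
signature_db = {signature(b"strain analyzed in the lab")}


def is_known_malware(sample: bytes) -> bool:
    """The client can only flag what already has a database entry."""
    return signature(sample) in signature_db


print(is_known_malware(b"strain analyzed in the lab"))  # True
print(is_known_malware(b"strain nobody has analyzed"))  # False
```

The second lookup is the whole problem in miniature: a strain that never reached the lab has no entry, so the client has nothing to match against.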
Next problem. Nefarious types create more than 50,000 new malware strains each day. Analyzing malware is labor-intensive, so antivirus companies have automated the analysis process in order to keep their databases reasonably up to date.
If it were no longer possible to analyze malware samples automatically, the sheer number of new malware strains would quickly render the signature database hopelessly out of date.
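A quick back-of-the-envelope calculation shows why automation matters. The 50,000-per-day figure comes from above and is a lower bound; the one hour of manual analysis per sample and the eight-hour analyst day are assumptions picked purely for illustration.

```python
NEW_STRAINS_PER_DAY = 50_000  # figure quoted above ("more than 50,000")
SECONDS_PER_DAY = 24 * 60 * 60

# Time budget per sample if one pipeline has to keep pace with the flow.
seconds_per_sample = SECONDS_PER_DAY / NEW_STRAINS_PER_DAY

# Assumptions for illustration only, not measured values.
ASSUMED_MANUAL_HOURS_PER_SAMPLE = 1.0
ASSUMED_ANALYST_HOURS_PER_DAY = 8.0

analysts_needed = (
    NEW_STRAINS_PER_DAY * ASSUMED_MANUAL_HOURS_PER_SAMPLE
    / ASSUMED_ANALYST_HOURS_PER_DAY
)

print(f"{seconds_per_sample:.3f} seconds per sample")  # 1.728 seconds per sample
print(f"{analysts_needed:.0f} analysts needed")        # 6250 analysts needed
```

Under those assumptions, a single pipeline has under two seconds per sample, and manual analysis would need thousands of analysts.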
The bad news
"We demonstrate techniques that, if widely adopted by the criminal underground, would permanently disadvantage automated malware analysis by making it ineffective and unscalable."
The above quote is from the introduction to a research paper by Chengyu Song and Paul Royal, researchers at Georgia Tech's Information Security Center. To drive the point home, the researchers ended their paper with this:
"Flashback's use of an infected system's hardware UUID as a decryption key demonstrates that malware authors have already begun using protections like those described in this paper."
Flashback, you may remember, caused quite a stir. It was the first real threat to Mac operating systems. More importantly, it employed groundbreaking encryption techniques.
A paper by Daryl Ashley of the University of Texas describes in detail how portions of Flashback's malcode were encrypted using the computer's Universally Unique Identifier (UUID). Encrypting malcode is not unheard of, but using encryption to bind the obfuscated code to one specific computer is.
Host Identity-based Encryption
You may recognize the new obfuscation technique, called Host Identity-based Encryption (HIE). It's similar to what the movie and music industries use to prevent copying. Let's take a look at how it works.
Using any one of a number of techniques, the malware loader gains a foothold on a vulnerable computer. Once entrenched, the malware loader collects information specific to the computer under attack. Using the collected information, the malware loader then creates an encryption key that will encrypt key portions of the malware payload.
The next step is to load the malware payload.
The malware loader proceeds to collect the very same information and derive the exact same key. Now here is where it gets a bit weird. The malware loader then uses the key to decrypt the malcode, which then installs and goes about its dirty work.
I say weird, as it seems silly to first encrypt, then turn around and decrypt. After some head scratching, I realized it's not silly at all; gathering information, creating the key, and encrypting the malcode does not trip any alarms. And more importantly, during installation, if an antivirus program catches the malcode, it's no big deal.
All that was captured is malcode specific to the victim computer. The malcode will not execute anywhere else, because decryption fails. Meaning, the expert who is stuck reverse-engineering this type of malware must first figure out how to decrypt the code. Song and Royal's paper expands on the advantages afforded by HIE:
- It uses modern cryptography. Knowledge of how a key is derived does not affect the integrity of the protection. Unless the defender can guess the same decryption key, they cannot unlock the sample.
- Any two samples of malware will possess different decryption keys, meaning the intelligence gathered from successfully analyzing one malware instance provides no advantage in analyzing the second.
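For analysts who want to see why a captured sample stays locked in the lab, here is a toy sketch of the idea using only Python's standard library. It is not the construction from Song and Royal's paper: the hash-based keystream, the HMAC check, the host values, and every function name are assumptions chosen to keep the example short, and the "payload" is a harmless placeholder string.

```python
import hashlib
import hmac


def derive_key(identifiers):
    """Hash a list of host identifier strings into a 32-byte key."""
    material = "\x00".join(identifiers).encode("utf-8")
    return hashlib.sha256(material).digest()


def keystream(key, length):
    """Toy keystream: SHA-256 over key plus a counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(key, plaintext):
    """XOR with the keystream, then append an HMAC tag so a wrong key is detectable."""
    body = bytes(a ^ b for a, b in zip(plaintext, keystream(key, len(plaintext))))
    return body + hmac.new(key, body, hashlib.sha256).digest()


def unseal(key, blob):
    """Return the plaintext, or None when the key does not match."""
    body, tag = blob[:-32], blob[-32:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(body, keystream(key, len(body))))


# Made-up identifier values: username, computer name, CPU identifier, MAC address.
victim = ["alice", "ALICE-PC", "Intel64 Family 6", "00-1A-2B-3C-4D-5E"]
sandbox = ["analyst", "SANDBOX-01", "Intel64 Family 6", "00-50-56-AA-BB-CC"]

blob = seal(derive_key(victim), b"placeholder bytes")

print(unseal(derive_key(victim), blob))   # b'placeholder bytes'
print(unseal(derive_key(sandbox), blob))  # None
```

Everything about `derive_key` is public here, yet the sandbox still cannot open the blob because it does not hold the victim's values. That is also why duplicating the victim's environment is one of the few options left to the defender.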
What information is used to create keys?
The researchers point out that the choice of information used to create the encryption key is up to the malware developer. For their tests, they used the following identifiers:
- Environment Block: When a process is created, Windows stores environment information in the process' address space. In our design we use the process owner's username, computer name, and CPU identifier. As the environment block is directly accessible by code that executes inside a given process, this information can be easily obtained.
- MAC address: The MAC address of the NIC can be obtained from the GetAdaptersInfo API.
- Graphics Processing Unit (GPU) info: GPU information can be obtained from the GetAdapterIdentifier method of IDirect3D9Ex interface. In our design, we use the device description.
- User Security Identifier (SID): Using the token of a process, the GetTokenInformation API can be used to obtain the SID of the process' owner. This identifier is unique across a Windows domain.
More bad news
If that's not bad enough, earlier this year Dancho Danchev penned an interesting blog post for WebRoot, pointing out that when it comes to obfuscating malcode, malware developers are hard at work. Here are two examples:
- Fully Undetectable cryptors: Tools designed to mask malware, preventing computer security programs from discovering the malware. The idea is to keep changing the cryptor until the malware becomes unrecognizable to antivirus scans.
- Server-side polymorphism: Malware that morphs its appearance each time it runs. Remote servers control the mutations, preventing any examination by security companies.
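Both tricks exploit the same weakness: change the bytes and an exact-match signature no longer matches. Here is a small sketch, assuming a signature is just a SHA-256 hash of the file. The `repack` function is a made-up stand-in for whatever a cryptor or mutation server actually does.

```python
import hashlib


def repack(core: bytes, generation: int) -> bytes:
    """Hypothetical stand-in for a mutation: same core, different wrapper bytes."""
    return generation.to_bytes(4, "big") + core


core = b"identical behavior"
first = hashlib.sha256(repack(core, 1)).hexdigest()
second = hashlib.sha256(repack(core, 2)).hexdigest()

print(first == second)  # False
```

The core is identical in both variants, but a database entry for the first tells the client nothing about the second.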
As I read through Song and Royal's paper, I began wondering whether the two researchers provided a solution. They did offer the following suggestions:
- Analyze the malware in its original environment, such as a honeypot.
- Collect host and network environmental information and duplicate it on a controlled computer.
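The second suggestion starts with recording what the infected host looks like. Here is a minimal sketch of such a snapshot, using only Python's standard library. The field names and the `host_profile` helper are made up for illustration, and the GPU description and user SID mentioned earlier are left out because the standard library does not expose them.

```python
import getpass
import json
import platform
import uuid


def current_user() -> str:
    """Best-effort username lookup; some environments have none to report."""
    try:
        return getpass.getuser()
    except Exception:
        return "unknown"


def host_profile() -> dict:
    """Collect a few of the identifiers an analyst would need to reproduce."""
    mac = uuid.getnode()  # may be a random value if no hardware address is found
    return {
        "username": current_user(),
        "computer_name": platform.node(),
        "cpu": platform.processor(),  # may be an empty string on some systems
        "mac_address": ":".join(
            f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8)
        ),
    }


print(json.dumps(host_profile(), indent=2))
```

A snapshot like this is only a starting point. Every value the malware author chose to fold into the key has to be matched exactly on the controlled computer, and the defender first has to work out which values those are.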
The researchers seem to talk themselves out of any other solutions as they presented them -- not a good sign.
I didn't have any intention of turning the issues surrounding traditional malware into a series. But this is information all of us need to be aware of. HIE-based malware, like Flashback and now Gauss, is out there and knocking at the door.
Also, the point I made in the first post, that user education is the best defense, is now more relevant than ever. We can't let malware gain that initial foothold...period.
I'd like to thank Chengyu Song, Paul Royal, Dancho Danchev, and their respective organizations for shedding light on the failings of traditional antivirus.