University of Louisiana
The metadata embedded in program executables provides information that can be useful for automated malware detection and classification. Because a single malware family may have tens of thousands of variants, it is unclear how consistent this metadata is within a family, and whether families differ in how consistent they are. Header information from multiple variants of recent malware was studied to characterize its variability within and among malware families. Classification accuracy, measured with multiple common classifiers, showed that even for rapidly mutating malware families, classifiers trained on header information can outperform those trained on the program bodies.