Tracking Concept Drift in Malware Families
Previous efforts to apply machine learning to malware detection have assumed that the malware population is stationary, i.e., that the probability distribution of the observed characteristics (features) of the malware population does not change over time. In this paper, the authors investigate this assumption, treating malware families as populations. Malware, by design, constantly evolves to defeat detection, and this evolution may lead to a non-stationary malware population. In machine learning, the problem of non-stationary populations is known as concept drift.
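To make the stationarity assumption concrete, the following minimal sketch compares the empirical distribution of a single feature across two time windows using the two-sample Kolmogorov-Smirnov statistic. This is an illustration only and is not presented as the authors' method; the function name `ks_statistic` and all feature values are hypothetical. A statistic near 0 is consistent with a stationary feature, while a value near 1 suggests the distribution has shifted.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical empirical distributions, 1.0 means
    the samples do not overlap at all.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap is
    # attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical values of one feature for samples of a single malware
# family, grouped by the time window in which they were observed.
early = [0.10, 0.12, 0.11, 0.13, 0.12, 0.10]
late_same = [0.11, 0.12, 0.10, 0.13, 0.11, 0.12]
late_shifted = [0.30, 0.32, 0.31, 0.29, 0.33, 0.30]

print("no drift:", ks_statistic(early, late_same))
print("drift:   ", ks_statistic(early, late_shifted))
```

A single-feature comparison like this ignores joint changes across features and does not by itself establish statistical significance; it only illustrates what "the distribution changes over time" means operationally.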