Malware detection is an important problem: new malware appears every day, and detecting it depends on recognizing the families of existing malware. Data mining techniques can help here, and unsupervised learning methods in particular are well suited to the task. This paper compares the behaviour of two representations for malware executables, a set of twelve distances for comparing them, and three variants of the hierarchical agglomerative clustering algorithm when used to capture the structure of different malware families and subfamilies. The authors propose a way to carry out this comparison in an unsupervised learning setting and draw several conclusions from the work as a whole.
- Format: PDF
- Size: 123.8 KB
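
As a rough illustration of the kind of experiment the abstract describes, the sketch below runs hierarchical agglomerative clustering over a handful of feature vectors with a pluggable distance and linkage rule. Everything concrete here is an assumption made for illustration: the abstract does not say which two representations, which twelve distances, or which three clustering variants the paper uses, so the toy count vectors in `samples`, the Manhattan and Euclidean distances, and the single, complete, and average linkage rules are stand-ins, not the authors' method.

```python
from itertools import combinations


def manhattan(a, b):
    """Sum of absolute coordinate differences (an assumed example distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance (a second assumed example distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Assumed linkage variants: each reduces the list of pairwise distances
# between two clusters to a single cluster-to-cluster distance.
LINKAGES = {
    "single": min,                            # closest pair
    "complete": max,                          # farthest pair
    "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
}


def agglomerative(points, distance, linkage, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns clusters as sorted lists of indices into `points`.
    """
    reduce_fn = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairwise = [distance(points[p], points[q])
                        for p in clusters[i] for q in clusters[j]]
            d = reduce_fn(pairwise)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]                          # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


# Hypothetical feature vectors, e.g. counts of three instruction types
# per executable. Samples 0-1 and 2-4 play the role of two "families".
samples = [(5, 0, 1), (6, 0, 1), (0, 4, 4), (0, 5, 4), (1, 4, 5)]

if __name__ == "__main__":
    for name in LINKAGES:
        print(name, agglomerative(samples, manhattan, name, n_clusters=2))
```

On data this well separated, every linkage and both distances recover the same two groups; differences between representations, distances, and linkage rules only show up on harder data, which is the kind of comparison the paper sets out to make.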