University of Detroit Mercy
Malware is any computer software that is harmful to computers and networks. The amount of malware increases every year and poses a serious global security threat. In this paper, the authors propose a new method that adopts a collective learning approach to detect unknown malware. Collective classification is a type of semi-supervised learning that offers a way to optimize the classification of partially labelled data. The authors propose, for the first time, the use of collective classification algorithms to build different machine-learning classifiers from a set of labelled instances (malware and legitimate software) and unlabelled instances. They perform an empirical validation demonstrating that the labelling effort is lower than with supervised learning, while high accuracy rates are maintained.
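To make the idea of learning from partially labelled data concrete, the following is a minimal sketch of one simple collective scheme: each unlabelled instance is first classified from its nearest labelled neighbours, and the predictions are then refined iteratively so that unlabelled instances also inform one another. The abstract does not say which collective classification algorithms the authors use, so this should not be read as their method. The function name `collective_classify`, the two-dimensional feature vectors, and the parameters `k` and `max_iter` are all assumptions made for illustration.

```python
from math import dist


def collective_classify(labelled, unlabelled, k=3, max_iter=10):
    """Illustrative iterative neighbour voting (not the authors' algorithm).

    labelled:   list of (feature_vector, label) pairs
    unlabelled: list of feature vectors
    Returns one predicted label per unlabelled instance.
    """

    def vote(x, pool):
        # Majority label among the k instances in `pool` closest to x.
        nearest = sorted(pool, key=lambda item: dist(x, item[0]))[:k]
        labels = [label for _, label in nearest]
        # Iterating over the sorted label set makes tie-breaking deterministic.
        return max(sorted(set(labels)), key=labels.count)

    # Bootstrap step: use only the labelled instances.
    preds = [vote(x, labelled) for x in unlabelled]

    # Refinement: every unlabelled instance is re-classified using the labelled
    # set plus the current predictions for all *other* unlabelled instances.
    for _ in range(max_iter):
        new_preds = []
        for i, x in enumerate(unlabelled):
            others = [(u, p) for j, (u, p) in enumerate(zip(unlabelled, preds)) if j != i]
            new_preds.append(vote(x, labelled + others))
        if new_preds == preds:  # predictions are stable, stop early
            break
        preds = new_preds
    return preds


# Hypothetical toy data: two made-up numeric features per executable.
labelled = [
    ((0.1, 0.2), "legitimate"),
    ((0.2, 0.1), "legitimate"),
    ((0.9, 0.8), "malware"),
    ((0.8, 0.9), "malware"),
]
unlabelled = [(0.15, 0.15), (0.85, 0.85), (0.3, 0.25), (0.7, 0.75)]

predictions = collective_classify(labelled, unlabelled)
print(predictions)
```

Only four instances carry labels here, yet all eight take part in the final decision. That is the intuition behind the reduced labelling effort the abstract mentions, although a toy example says nothing about the accuracy achieved on real malware.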