Using Visual Features for Anti-Spam Filtering
Source: University of California
Unsolicited Commercial Email (UCE), also known as spam, has been a major problem on the Internet. Researchers have traditionally treated spam detection as a text classification problem. However, as spammers' techniques evolve and email content grows more diverse, text-based anti-spam approaches alone are no longer sufficient. This paper proposes a novel anti-spam system that uses visual clues, in addition to the text in the email body, to determine whether a message is spam. The authors analyze a large collection of spam emails containing images and identify a number of visual features useful for this task. They then propose one-class Support Vector Machines (SVMs) as the base classifier for anti-spam filtering.
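As a rough illustration of the one-class idea, the Python sketch below trains scikit-learn's OneClassSVM on spam-only feature vectors and then checks whether new vectors fall inside the learned region. The three feature dimensions, the synthetic data, the parameter values, and the helper name looks_like_spam are all assumptions made for this example; the abstract does not list the features or settings the authors actually use.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in for visual feature vectors, one row per spam image.
# The three dimensions are hypothetical placeholders, not the paper's features.
rng = np.random.default_rng(0)
spam_features = rng.normal(loc=[0.8, 0.6, 0.7], scale=0.05, size=(200, 3))

# One-class training: the model only ever sees spam examples.
# nu is an upper bound on the fraction of training points treated as outliers
# (0.05 is an arbitrary choice for this sketch).
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(spam_features)


def looks_like_spam(features):
    """Return True if the vector falls inside the region learned from spam."""
    x = np.asarray(features, dtype=float).reshape(1, -1)
    # predict() returns +1 for points inside the learned region, -1 otherwise.
    return bool(model.predict(x)[0] == 1)


typical_spam = [0.8, 0.6, 0.7]   # near the center of the training data
unrelated = [0.1, 0.1, 0.1]      # far from anything seen during training

print(looks_like_spam(typical_spam))
print(looks_like_spam(unrelated))
```

Because the model is fit on one class only, no labeled legitimate email is needed at training time; anything that does not resemble the training spam is simply rejected as not spam-like.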