Going Mini: Extreme Lightweight Spam Filters

Date Added: Jun 2009
Format: PDF

In this paper, the authors set out to determine whether effective mini-filters can be trained for email spam filtering using a drastically reduced feature set. The experimental results suggest that several methods, including boosting with early stopping, greedy decision lists, and TWFS methods, all give effective, low-cost solutions to this problem. Furthermore, the initially appealing ideas of global information gain or decision trees of limited depth are shown to be less than ideal, especially under conditions of class label noise.