SpamBlocker: Combining NLP and Genetic Algorithms to Improve Spam Defence Efficiency

Executive Summary

Distinguishing solicited email from spam for millions of email users worldwide has not yet been fully solved, in part because user preferences vary and language use is idiosyncratic. Many anti-spam systems treat email text as generic and cannot adapt to individual user preferences. This paper presents an anti-spam solution that combines two computational techniques in a two-step filtering process: keyword extraction and a genetic algorithm that builds a spam filter from user word profiles. The algorithm has been tested on a text classification task, and preliminary results indicate higher performance than one-layer classification systems. The aim is to implement the system as an email plug-in for Microsoft Outlook.
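
The summary gives no implementation details, so the following is only a minimal sketch of how a two-step filter of this kind might look, not the paper's actual method. It assumes frequency-based keyword extraction with a small hand-picked stopword list, a user word profile represented as a dictionary of per-word weights, and a simple genetic algorithm (truncation selection, uniform crossover, Gaussian mutation) whose fitness is classification accuracy on labelled emails. All function names, parameters, and the toy emails are hypothetical.

```python
import random
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not taken from the paper.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "for", "your", "on", "now"}

def extract_keywords(text, top_n=5):
    """Step 1 (assumed form): the top_n most frequent non-stopword tokens."""
    tokens = [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_n)]

def score(weights, keywords):
    """Sum the profile weights of an email's keywords; a positive sum means spam."""
    return sum(weights.get(k, 0.0) for k in keywords)

def fitness(weights, samples):
    """Fraction of labelled (keywords, is_spam) samples classified correctly."""
    correct = sum((score(weights, kws) > 0) == is_spam for kws, is_spam in samples)
    return correct / len(samples)

def evolve_profile(samples, generations=40, pop_size=30, mutation_rate=0.2, seed=0):
    """Step 2 (assumed form): evolve a word-weight profile with a simple GA."""
    rng = random.Random(seed)
    vocab = sorted({k for kws, _ in samples for k in kws})
    population = [{w: rng.uniform(-1, 1) for w in vocab} for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, samples), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection; parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            # Uniform crossover: each weight is copied from one parent at random.
            child = {w: rng.choice((mum[w], dad[w])) for w in vocab}
            for w in vocab:
                if rng.random() < mutation_rate:
                    child[w] += rng.gauss(0, 0.3)  # Gaussian mutation
            children.append(child)
        population = parents + children
    return max(population, key=lambda ind: fitness(ind, samples))

# Toy labelled emails standing in for one user's mailbox (hypothetical data).
emails = [
    ("Win a free prize now, claim your free cash", True),
    ("Cheap pills, free offer, claim your prize", True),
    ("Meeting agenda for the project review on Monday", False),
    ("Lunch on Friday? Project review notes attached", False),
]
samples = [(extract_keywords(text), label) for text, label in emails]
profile = evolve_profile(samples)
print("training accuracy:", fitness(profile, samples))
```

In this sketch the evolved profile is specific to the labelled emails it was trained on, which is one way a filter could adapt to an individual user's vocabulary. A real system would also need held-out evaluation data and a far larger corpus.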

  • Format: PDF
  • Size: 104.37 KB