Collaborative Email-Spam Filtering With the Hashing-Trick

Executive Summary

This paper examines a recently proposed technique for collaborative spam filtering that supports personalization with a guaranteed finite memory footprint. In large-scale, open-membership email systems, most users do not label enough messages for an individual local classifier to be effective, while the pooled data is too noisy to support a single global filter across all users. The authors' hybrid global/individual classifier is particularly effective at absorbing the influence of users who label emails very differently from the general public, whether because of unusual taste or malicious intent. It does so while still providing sufficient classifier quality to users with few labeled instances.
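
The summary does not spell out the mechanism, so the following is only a minimal sketch of how a hashed global/personal classifier could look, not the authors' implementation. It assumes each token is hashed twice into one fixed-size weight vector: once on its own (the shared, global feature) and once prefixed with the user's ID (the personal feature). The bucket count, the MD5-based hash, the function names, and the perceptron-style update are all illustrative assumptions.

```python
import hashlib

# Assumption: a fixed-size weight vector shared by all users. Memory stays
# bounded no matter how many users or distinct tokens appear.
NUM_BUCKETS = 2 ** 18


def bucket(feature: str) -> int:
    """Map a feature name to a bucket index with a stable hash."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def hashed_features(user_id: str, tokens: list[str]) -> dict[int, float]:
    """Hash every token twice: as a global feature and as a personal one."""
    counts: dict[int, float] = {}
    for token in tokens:
        for name in (token, f"{user_id}_{token}"):
            idx = bucket(name)
            counts[idx] = counts.get(idx, 0.0) + 1.0
    return counts


def score(weights: list[float], features: dict[int, float]) -> float:
    """Linear score; a positive value is read as spam in this sketch."""
    return sum(weights[i] * v for i, v in features.items())


def perceptron_update(weights: list[float], features: dict[int, float],
                      label: int, lr: float = 1.0) -> None:
    """Mistake-driven update; label is +1 (spam) or -1 (not spam)."""
    if label * score(weights, features) <= 0:
        for i, v in features.items():
            weights[i] += lr * label * v


weights = [0.0] * NUM_BUCKETS
message = ["cheap", "pills"]

# One user labels the message as spam.
perceptron_update(weights, hashed_features("alice", message), label=+1)

alice_score = score(weights, hashed_features("alice", message))
carol_score = score(weights, hashed_features("carol", message))
```

In this sketch a label from one user moves both the shared and the personal weights, so the labeling user sees the strongest effect while a user with no labels of their own still benefits from the shared part. A user who labels very differently from everyone else mostly ends up shaping their own personal weights, which is the intuition behind the robustness described above.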

  • Format: PDF
  • Size: 848.2 KB