Collaborative Email-Spam Filtering With the Hashing-Trick
Source: Polytechnic Institute of NYU
This paper examines a recently proposed technique for collaborative spam filtering that enables personalization with finite-sized memory guarantees. In large-scale, open-membership email systems, most users do not label enough messages for an individual local classifier to be effective, while the data is too noisy to be used for a single global filter across all users. The hybrid global/individual classifier is particularly effective at absorbing the influence of users who label emails very differently from the general public, whether because of unusual taste or malicious intent.
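The abstract does not spell out the mechanism, so the following is only a minimal sketch of one common way the hashing trick is used to combine a global and a per-user classifier in a single fixed-size weight vector. Each token is hashed twice, once on its own (shared by all users) and once prefixed with the user's identifier (private to that user). The function names, the use of MD5, the sign bit, and the bucket count are illustrative assumptions, not details taken from the paper.

```python
import hashlib


def hash_feature(name: str, num_buckets: int) -> tuple[int, int]:
    """Map a feature name to (bucket index, sign). Hypothetical helper."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "little") % num_buckets
    sign = 1 if digest[8] & 1 else -1  # signed hashing, an assumed choice
    return index, sign


def personalized_features(user_id: str, tokens: list[str],
                          num_buckets: int = 2 ** 18) -> dict[int, float]:
    """Build a sparse hashed vector holding global and user-specific copies."""
    vec: dict[int, float] = {}
    for tok in tokens:
        # The bare token is the global feature; the prefixed one is per-user.
        for name in (tok, f"{user_id}_{tok}"):
            index, sign = hash_feature(name, num_buckets)
            vec[index] = vec.get(index, 0.0) + sign
    return vec


def score(weights: list[float], vec: dict[int, float]) -> float:
    """Linear score against one shared weight vector of fixed length."""
    return sum(weights[i] * v for i, v in vec.items())
```

Because every feature lands in one of `num_buckets` slots, memory stays bounded no matter how many users or distinct tokens appear, and a user with too few labels still benefits from the weights learned on the shared global copies.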