Spam Decisions on Gray e-Mail Using Personalized Ontologies
E-mail is one of the most common means of communication on the Internet. However, the growing misuse and abuse of e-mail have led to an increasing volume of spam in recent years. Because spammers continually look for ways to evade existing spam filters, new filters must be developed to catch spam. A statistical learning filter is at the core of many commercial anti-spam filters. It can be trained either globally for all users or personally for each user. Generally, globally-trained filters outperform personally-trained filters for both small and large collections of users in real-world environments. However, globally-trained filters sometimes ignore personal data: they cannot retain a user's personal preferences and context as to whether a feature should be treated as an indicator of legitimate e-mail or of spam.
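To illustrate the distinction between globally and personally trained filters, the following is a minimal sketch, not the method proposed in this work. It assumes a simple token-count model with Laplace smoothing as a stand-in for a statistical learning filter. The function names (`train`, `token_log_odds`) and the toy messages are hypothetical and chosen only for illustration. The sketch shows how the same feature can point toward spam under global training and toward legitimate e-mail under one user's personal training.

```python
import math
from collections import Counter


def train(messages):
    """Count token occurrences per class from (tokens, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    for tokens, label in messages:
        counts[label].update(tokens)
    return counts


def token_log_odds(counts, token):
    """Laplace-smoothed log-odds for one token: > 0 leans spam, < 0 leans legitimate."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    v = len(vocab) + 1  # one extra slot reserves probability mass for unseen tokens
    p_spam = (counts["spam"][token] + 1) / (sum(counts["spam"].values()) + v)
    p_ham = (counts["ham"][token] + 1) / (sum(counts["ham"].values()) + v)
    return math.log(p_spam / p_ham)


# Hypothetical toy data: mail pooled across all users (global training).
global_mail = [
    (["mortgage", "offer"], "spam"),
    (["mortgage", "rates"], "spam"),
    (["meeting", "agenda"], "ham"),
    (["lunch", "meeting"], "ham"),
]

# Hypothetical toy data: mail of one user who handles mortgages at work.
personal_mail = [
    (["mortgage", "application"], "ham"),
    (["mortgage", "rates"], "ham"),
    (["lottery", "winner"], "spam"),
]

global_score = token_log_odds(train(global_mail), "mortgage")
personal_score = token_log_odds(train(personal_mail), "mortgage")

print(f"global log-odds for 'mortgage':   {global_score:+.3f}")
print(f"personal log-odds for 'mortgage': {personal_score:+.3f}")
```

In this toy setting the globally trained counts treat "mortgage" as evidence of spam, while the personally trained counts treat it as evidence of legitimate e-mail, which is the kind of per-user context a purely global filter cannot retain.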