Better Naive Bayes Classification for High-Precision Spam Detection
Source: John Wiley & Sons
Email spam has become a major problem for Internet users and providers. One key obstacle to its eradication is that any candidate solution must ensure a very low false-positive rate (FPR), which is difficult to achieve in practice. The authors address the problem of low-FPR classification in the context of naive Bayes, one of the most popular machine learning models applied in the spam filtering domain. Drawing on recent extensions of the model, they propose a new term weight aggregation function, which leads to markedly better results than the standard alternatives.
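For orientation, the following is a minimal sketch of the kind of standard baseline the abstract refers to, not the aggregation function the paper proposes (the abstract does not specify it). It assumes a multinomial naive Bayes model with add-alpha smoothing, per-term weights defined as log-likelihood ratios, aggregation by plain summation, and a decision threshold picked from ham validation scores to respect a target FPR. All function names and parameters here are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def train_term_weights(docs, labels, alpha=1.0):
    """Fit illustrative naive Bayes term weights.

    docs: list of token lists; labels: 1 for spam, 0 for ham.
    Assumes both classes occur at least once in the training data.
    weights[t] = log P(t | spam) - log P(t | ham), with add-alpha smoothing.
    """
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter(labels)
    for tokens, y in zip(docs, labels):
        counts[y].update(tokens)
    vocab = set(counts[0]) | set(counts[1])
    totals = {y: sum(counts[y].values()) + alpha * len(vocab) for y in (0, 1)}
    weights = {
        t: math.log((counts[1][t] + alpha) / totals[1])
        - math.log((counts[0][t] + alpha) / totals[0])
        for t in vocab
    }
    prior_log_odds = math.log(n_docs[1] / n_docs[0])
    return prior_log_odds, weights

def score(tokens, prior_log_odds, weights):
    # Standard aggregation: sum the per-term weights.
    # Terms never seen in training contribute 0.
    return prior_log_odds + sum(weights.get(t, 0.0) for t in tokens)

def threshold_for_fpr(ham_scores, max_fpr):
    """Pick a threshold so that the share of ham scoring strictly
    above it does not exceed max_fpr on the given validation scores."""
    ranked = sorted(ham_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # ham messages we may misclassify
    return ranked[allowed] if allowed < len(ranked) else float("-inf")
```

A message would then be flagged as spam only when its score is strictly greater than the chosen threshold. Lowering max_fpr pushes the threshold up, which trades missed spam for fewer false positives; a different aggregation function would replace the summation in score.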