Classification of Textual E-Mail Spam Using Data Mining Techniques

A new method is proposed for clustering spam messages collected in the databases of an anti-spam system. A genetic algorithm is developed to solve the clustering problem. The objective function maximizes the similarity between messages within clusters, where similarity is defined by the k-nearest neighbor algorithm. Applying a genetic algorithm to constrained problems raises the difficulty of continually maintaining feasible chromosomes, which slows convergence. Therefore, to accelerate convergence of the genetic algorithm, a penalty function is applied when ranking fitness values, which prevents infeasible chromosomes from occurring.
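
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the chromosome encoding, the constraint, or the exact similarity measure, so everything below is an assumption made for illustration. Here a chromosome is a list of cluster labels (one per message), similarity is counted as the number of k-nearest neighbors that share a message's cluster, and the assumed constraint is that no cluster may be empty. The names knn_lists, fitness, evolve and the penalty weight are hypothetical.

```python
import random
from math import sqrt


def knn_lists(points, k):
    """For each point, return the indices of its k nearest neighbors (Euclidean)."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


def fitness(chrom, neighbors, n_clusters, penalty=100.0):
    """kNN agreement minus a penalty per empty cluster (assumed constraint)."""
    # Count (message, neighbor) pairs that were placed in the same cluster.
    agreement = sum(1 for i, nbrs in enumerate(neighbors)
                    for j in nbrs if chrom[i] == chrom[j])
    # Infeasible chromosomes are not repaired; they are simply ranked lower.
    empty = n_clusters - len(set(chrom))
    return agreement - penalty * empty


def evolve(points, n_clusters, k=2, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Toy genetic algorithm: tournament selection, uniform crossover, elitism."""
    rng = random.Random(seed)
    neighbors = knn_lists(points, k)
    n = len(points)

    def score(chrom):
        return fitness(chrom, neighbors, n_clusters)

    pop = [[rng.randrange(n_clusters) for _ in range(n)]
           for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        new_pop = [best[:]]  # keep the best chromosome found so far
        while len(new_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [rng.randrange(n_clusters) if rng.random() < mutation_rate
                     else gene for gene in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=score)
    return best, score(best)
```

With messages represented as numeric feature vectors (for example, term counts), a call such as evolve(vectors, n_clusters=2) returns a label list and its fitness. Because the penalty only lowers the rank of infeasible chromosomes, no repair step is needed after crossover or mutation, which is one plausible reading of how a penalty function could speed up convergence.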

Provided by: Hindawi Publishing. Topic: Big Data. Date Added: Sep 2011. Format: PDF.
