Spam: It’s what’s for dinner. And breakfast. And lunch. And every snack in between. Wherever you turn these days, spam is invading inboxes, quickly making the jump from an annoyance to a major business problem. Spam is much more operating system-agnostic than many e-mail viruses, so you can find a host of anti-spam solutions for a variety of products on a variety of platforms.
One solution for UNIX and Linux mail servers is DSPAM, which acts as the local delivery agent for the server and learns to recognize spam to ease the administrative burden of constantly keeping up with blacklists. DSPAM uses a Bayesian statistical analysis to improve the success rate and reduce the percentage of false positives.
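To make the statistical idea concrete, here is a minimal sketch of how past mail can be turned into a spam probability for a single word. It applies Bayes' theorem with equal prior odds for spam and innocent mail. It is not DSPAM's actual scoring code; the function name and all of the counts are invented for illustration.

```shell
# Estimate the chance that a message is spam, given that it contains one token.
# Arguments: spam messages with the token, total spam messages,
#            innocent messages with the token, total innocent messages.
# Assumes the token has been seen at least once, so the divisor is not zero.
token_spam_probability() {
    awk -v spam_hits="$1" -v spam_total="$2" \
        -v ham_hits="$3" -v ham_total="$4" 'BEGIN {
        p_spam = spam_hits / spam_total   # how common the token is in spam
        p_ham  = ham_hits / ham_total     # how common it is in innocent mail
        printf "%.3f\n", p_spam / (p_spam + p_ham)
    }'
}

# A token seen in 8 of 10 spam messages and 1 of 10 innocent messages:
token_spam_probability 8 10 1 10   # prints 0.889
```

As more mail is classified, the counts grow and the estimate for each token settles, which is why a learning filter tends to improve with use.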
What's Bayesian analysis?
"Bayesian," according to Merriam-Webster Online, is “being, relating to, or concerned with a theory (as of decision making or statistical inference) involving the application of Bayes' theorem and the use of probabilities based on prior knowledge and accumulated experience.” Simply put, DSPAM uses an analysis of past results to continually improve its spam-detection rate, resulting in a higher success rate as time goes on.
DSPAM has two requirements: a mail server that can use a configurable local delivery agent, and the Berkeley DB4 database. Berkeley DB4 is easy to install, and full instructions are provided in its accompanying README file. As of this writing, the current version of DSPAM is 2.6.3, and you can download it from the DSPAM Web site. Let's walk through the process of installing and configuring DSPAM.
My lab configuration
For this article, I am using Red Hat 9 and my mail server is Sendmail.
First, download the latest version of DSPAM from the link above. For my example, the filename is dspam-2.6.tar.gz. From the directory where you have saved the download, execute the following command to expand the distribution:
gunzip -dc dspam-2.6.tar.gz | tar xvf -
Now, change to the expanded directory with the command cd dspam-2.6. You can build the configuration for DSPAM using a typical configure command with the options shown in Table A.
Table A
|Option||Description||Default|
|--with-local-delivery-agent=[mail program]||Use the program specified as the local mail delivery agent.||Depends on your system.|
|--with-userdir=[user directory]||Specify the directory where user dictionaries, signatures, etc. should be stored.||/etc/mail/dspam|
|--with-signature-life=[# of days]||The number of days for the signature life.||14 days|
|--with-db4-includes=[Location of DB4 includes]||Where to find Berkeley DB 4.1.x headers.||Depends on DB4 install.|
Since I did a typical install using Sendmail, I could use the following command to begin the installation process:
I included the path to the DB4 includes to make sure that the configuration script could find them. Unfortunately, on my Red Hat Linux 9 system, the configuration failed with an error relating to the Berkeley DB 4 libraries, even though I provided the location to find them. After finding the source of the error and visiting the helpful user discussion forums at the DSPAM Web site, I issued the following command before executing the configure script again:
export LDFLAGS='-Wl,--rpath -Wl,/usr/local/BerkeleyDB.4.1/lib -Wl,--library-path -Wl,/usr/local/BerkeleyDB.4.1/lib'
The LDFLAGS variable passes these options to the linker, so both the test programs that the configure script builds and the final DSPAM binaries can locate the Berkeley DB libraries.
Once the command prompt comes back and there are no errors, compile DSPAM using the make command. To install the compiled binaries into their final location, execute make install. This step needs to be performed as the root user. After this completes successfully, DSPAM is ready to be used by your mail program.
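The build sequence can be collected into one short script. This is only a sketch under assumptions: it presumes Berkeley DB was installed under /usr/local/BerkeleyDB.4.1 (the prefix used in the LDFLAGS line), that its headers sit in the include subdirectory of that prefix, and that /bin/mail is your delivery agent. DB4_PREFIX and LDA are placeholder names introduced here, not DSPAM settings.

```shell
#!/bin/sh
# Assumed locations; adjust both for your system.
DB4_PREFIX=/usr/local/BerkeleyDB.4.1   # assumed Berkeley DB install prefix
LDA=/bin/mail                          # assumed local delivery agent

# Tell the linker where the Berkeley DB libraries live,
# both when linking and when the binaries run.
LDFLAGS="-Wl,--rpath -Wl,$DB4_PREFIX/lib -Wl,--library-path -Wl,$DB4_PREFIX/lib"
export LDFLAGS

# Only attempt the build from inside the unpacked source directory.
if [ -x ./configure ]; then
    ./configure --with-local-delivery-agent="$LDA" \
                --with-db4-includes="$DB4_PREFIX/include" &&
    make &&
    echo "Build finished; now run 'make install' as root."
else
    echo "No configure script here; change to the DSPAM source directory first."
fi
```

Chaining the steps with && stops the script at the first failure, so a configure error is not buried under later output.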
Changes to the Sendmail configuration
Once DSPAM is installed, you need to modify your Sendmail configuration to use DSPAM as the local delivery agent. Doing this will force mail through the DSPAM engine so that it can do its job.
Changing the local delivery agent to the DSPAM executable is accomplished by modifying the Sendmail configuration file, sendmail.cf. Be sure to make a copy of sendmail.cf before changing it.
To make DSPAM active, find the line at the bottom of sendmail.cf labeled Mlocal. If you are not using procmail, the first option after Mlocal will read something like P=/bin/mail. In this case, replace the contents of the Mlocal line with the following:
Mlocal, P=/usr/local/bin/dspam, F=lsDFMAw5:/|@qfSmn9, S=EnvFromL/HdrFromL, R=EnvToL/HdrToL,
        A=dspam -d $u
If you are using procmail, which is identifiable by looking at the original Mlocal line, you need to use a slightly different configuration. With procmail, the first configuration option on the Mlocal line will read P=/usr/bin/procmail, and you will replace the contents of the Mlocal line with the following:
Mlocal, P=/usr/local/bin/dspam, F=lsDFMAw5:/|@qSPfhn9, S=EnvFromL/HdrFromL, R=EnvToL/HdrToL,
        A=dspam -t -Y -a $h -d $u
If you installed DSPAM to a different location, provide that location in place of /usr/local/bin/dspam.
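Before and after editing, it can help to print the Mlocal definition together with its indented continuation lines, so you can see which form your system uses and confirm your change. The helper below is a sketch; show_mlocal is a name invented here, and the location of sendmail.cf varies by system.

```shell
# Print the Mlocal mailer definition from a sendmail.cf file,
# including the indented continuation lines that follow it.
show_mlocal() {
    awk '
        /^Mlocal/            { printing = 1; print; next }  # start of the definition
        printing && /^[ \t]/ { print; next }                # indented continuation line
                             { printing = 0 }               # any other line ends it
    ' "$1"
}
```

For example, if your configuration file is /etc/mail/sendmail.cf (an assumption), you might run cp -p /etc/mail/sendmail.cf /etc/mail/sendmail.cf.orig to keep a backup, and then show_mlocal /etc/mail/sendmail.cf to inspect the current definition.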
Adding mail aliases
DSPAM works by having the user forward spam to a unique account that exists just for this purpose. For each user who will use DSPAM, you need to add a spam alias to the aliases file, which is typically located in either /etc or /etc/mail. On my Red Hat 9 system, it is in /etc.
Use a text editor to edit this file and add an entry similar to the following for each user:
spam-slowe: "|/usr/local/bin/dspam -d slowe --addspam"
The first part, spam-slowe, is simply an existing user ID with spam- as the prefix. The second part, |/usr/local/bin/dspam, will pipe mail received at this account through the executable you named (in this case, the DSPAM executable). The -d slowe portion indicates that the name of the dictionary is slowe. A separate dictionary is created for each user. Finally, --addspam indicates that the message is spam and should be used to help identify future spam.
After you have added an alias to the aliases file, run the command newaliases to rebuild the aliases dictionary, aliases.db.
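If you have more than a handful of users, typing each alias by hand gets tedious. The following sketch prints one alias line per user name in the format shown above; make_spam_aliases is a helper name invented here, and it assumes DSPAM is installed at /usr/local/bin/dspam.

```shell
# Print a spam-reporting alias for each user name given as an argument.
make_spam_aliases() {
    for user in "$@"; do
        printf 'spam-%s: "|/usr/local/bin/dspam -d %s --addspam"\n' "$user" "$user"
    done
}

make_spam_aliases slowe root
# prints:
# spam-slowe: "|/usr/local/bin/dspam -d slowe --addspam"
# spam-root: "|/usr/local/bin/dspam -d root --addspam"
```

Review the output first, then append it to your aliases file and run newaliases as described above.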
DSPAM with smrsh
If you are using a Sendmail system that uses smrsh (Sendmail restricted shell), you also need to add DSPAM's executable as a program that is allowed to be used by Sendmail. This is as easy as placing a link to the DSPAM executable in the smrsh configuration directory, which is typically /etc/smrsh. The following two commands accomplish this goal:
cd /etc/smrsh
ln -s /usr/local/bin/dspam dspam
If you use smrsh and fail to do this, you will be unable to forward spam to the spam identification accounts, and DSPAM will be unable to learn its job.
At this point, you should have a working DSPAM/Sendmail system with appropriate aliases for your users. Now, if your users receive spam, they should forward it to the "spam-username" alias you set up for them. As DSPAM learns what kind of mail the user considers spam, it will eventually begin simply blocking the spam items. In general, DSPAM can begin blocking with fewer than 50 e-mails forwarded to the spam agent, but it takes 200 to 300 for it to be truly useful.
As a test, I sent a few e-mails to the root user's spam account on my lab system to see what kind of statistics DSPAM compiled. I can get details on DSPAM's statistics by executing /usr/local/bin/dspam_stats. For the root user, I got the following statistics:
root 0 TS 7 TI 1 TM 0 FP
This indicates that seven innocent messages (TI) and one spam miss (TM) have been recorded, while no spam messages have been caught (TS), and there have been no false positives (FP).
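The terse output is easier to read with labels attached. This sketch assumes every line of dspam_stats output follows the layout of the sample above (user name, then four count and code pairs in the order TS, TI, TM, FP); explain_stats is a helper name invented here.

```shell
# Rewrite lines such as "root 0 TS 7 TI 1 TM 0 FP" with descriptive labels.
explain_stats() {
    awk '{
        printf "%s: %d spam caught, %d innocent, %d spam missed, %d false positives\n",
               $1, $2, $4, $6, $8
    }'
}

echo 'root 0 TS 7 TI 1 TM 0 FP' | explain_stats
# prints: root: 0 spam caught, 7 innocent, 1 spam missed, 0 false positives
```

On a live system you could pipe the real output through it, for example /usr/local/bin/dspam_stats | explain_stats.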
You need to perform some administrative tasks to keep DSPAM running efficiently and to keep it from gobbling up too much disk space. Each night, you should run the dspam_clean program from cron to clean the signature database. To do this, add the following line to your crontab:
0 0 * * * /usr/local/bin/dspam_clean
Every five days or so, you should also run the dspam_purge program to optimize the user dictionary files. The following cron configuration will do the trick:
0 0 5,10,15,20,25,30 * * /usr/local/bin/dspam_purge
Effective and free
DSPAM is not difficult to configure and maintain, and it can save an organization both the administrative hassle and the financial burden that is quickly mounting because of the massive amounts of spam that employees have to deal with. Best of all, DSPAM is free, making it much more economical to use than most other spam-fighting products.