Spam has recently started garnering more attention because of its chokehold on network bandwidth. Brightmail, which sends out automated antispam updates many times a day, maintains a "Probe Network" of about 200 million e-mail boxes to collect spam. The numbers aren’t encouraging. According to the company, which extrapolates Internet-wide numbers from what it collects, spam represented as little as 8 percent of all e-mail in September 2001. By March 2003, that number was 45 percent.
"The volume of spam is growing rapidly, having created a dire picture," said Marten Nelson, an analyst with Ferris Research. "However, on the positive side, there are technical solutions that more or less resolve the problem." Let's look at those solutions.
Spam solutions and secrets
You can combine several approaches to spam prevention to achieve the level of antispam protection your enterprise is after. These are the tools currently available and a few tricks used by spammers to evade detection.
The most basic approach to spam prevention is the use of whitelists and blacklists. Blacklists are databases of addresses from which the enterprise won’t accept e-mail. Many free and constantly updated blacklists are available on the Internet. There are two pieces of good news here: Blacklists are still good enough to catch most spam (experts say 80 percent is a fair estimate), and the ability to use whitelists and blacklists is built into most e-mail servers.
The bad news is that the remaining 20 percent of spam that slips through is still a lot of spam. Moreover, in the Darwinian world of the Internet, tools used by the best spammers will slowly be adopted by the less adept, so the percentage of spam caught by blacklists is likely to drop.
Whitelists are lists of approved senders that otherwise would be stopped by blacklists. If an enterprise uses a blacklist but wants its employees to keep receiving mail from a specific blacklisted sender, it can put that sender on the whitelist.
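The interplay between the two lists can be sketched in a few lines. This is only an illustration, assuming that entries may be either full addresses or bare domains and that a whitelist match overrides a blacklist match. The addresses and the `classify_sender` helper are hypothetical; in practice this is usually a server configuration setting rather than custom code.

```python
# Hypothetical lists; a real deployment would load these from a maintained source.
BLACKLIST = {"bulkmail.example", "promo@offers.example"}
WHITELIST = {"newsletter@bulkmail.example"}


def classify_sender(address: str) -> str:
    """Return 'accept' or 'reject' for a sender address.

    Assumption: an entry may be a full address or a bare domain,
    and a whitelist match overrides any blacklist match.
    """
    address = address.strip().lower()
    domain = address.rpartition("@")[2]
    if address in WHITELIST or domain in WHITELIST:
        return "accept"
    if address in BLACKLIST or domain in BLACKLIST:
        return "reject"
    return "accept"  # unknown senders pass on to any later filters
```

Here a newsletter address on a blacklisted domain is accepted because it is whitelisted, while every other address on that domain is rejected.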
Keyword searches look for specific words in addresses or the body of the e-mail. The obvious problem is that the same word can be in spam or a legitimate message. Keyword searches often are the first step in more elaborate heuristic filtering approaches. These approaches use a variety of methods to determine whether a message is likely to be spam.
With Bayesian analysis, a large amount of identified spam and an equal amount of legitimate e-mail are analyzed statistically to learn which words and other features are more common in one set than in the other. From this, the filter estimates how likely a new message is to be spam and sets a threshold between likely spam and likely legitimate mail. Incoming mail is sorted accordingly.
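A minimal sketch of the idea, written as a naive Bayes word-frequency classifier with add-one smoothing. That specific technique, the whitespace tokenizer, and the `NaiveBayesFilter` name are assumptions made for illustration, not a description of any vendor's product, and the tiny training set is invented.

```python
import math
from collections import Counter


def tokenize(text):
    """Crude tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayesFilter:
    def __init__(self, spam_msgs, ham_msgs):
        # Count how often each word appears in each training set.
        self.spam_counts = Counter(w for m in spam_msgs for w in tokenize(m))
        self.ham_counts = Counter(w for m in ham_msgs for w in tokenize(m))
        self.spam_total = sum(self.spam_counts.values())
        self.ham_total = sum(self.ham_counts.values())
        self.vocab_size = len(set(self.spam_counts) | set(self.ham_counts))
        # Prior probabilities taken from the training set sizes.
        n = len(spam_msgs) + len(ham_msgs)
        self.log_prior_spam = math.log(len(spam_msgs) / n)
        self.log_prior_ham = math.log(len(ham_msgs) / n)

    def spam_probability(self, message):
        """Estimate P(spam | message); suitable for short messages only."""
        log_spam = self.log_prior_spam
        log_ham = self.log_prior_ham
        for w in tokenize(message):
            # Add-one smoothing keeps unseen words from zeroing the product.
            log_spam += math.log(
                (self.spam_counts[w] + 1) / (self.spam_total + self.vocab_size))
            log_ham += math.log(
                (self.ham_counts[w] + 1) / (self.ham_total + self.vocab_size))
        return 1 / (1 + math.exp(log_ham - log_spam))


spam_filter = NaiveBayesFilter(
    spam_msgs=["win free money now", "free money offer"],
    ham_msgs=["meeting agenda for monday", "sales contract for review"],
)
THRESHOLD = 0.5  # hypothetical cutoff between likely spam and likely legitimate
```

A message is sorted as spam when `spam_filter.spam_probability(text)` exceeds `THRESHOLD`. Raising the threshold trades missed spam for fewer false positives.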
Sieve filtering is a heuristic approach in which system administrators write scripts in real time, based on recently received spam. If successful, these scripts will block subsequent instances of the same spam. So, for instance, if a new version of the Nigerian scam letter hits, the administrator can block all e-mail containing the words “Nigerian,” “bank,” and other elements common to that letter.
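The sketch below is not Sieve syntax. It only imitates, in Python, the kind of rule described above: an administrator adds a rule while the system is running, and any message containing all of the rule's terms is blocked. The `add_rule` and `is_blocked` helpers are hypothetical names.

```python
# Each rule is a set of terms that must ALL appear for a message to be blocked.
rules = []


def add_rule(*terms):
    """Register a new blocking rule; terms are matched case-insensitively."""
    rules.append({t.lower() for t in terms})


def is_blocked(message):
    """Return True if the message satisfies every term of at least one rule."""
    text = message.lower()
    return any(all(term in text for term in rule) for rule in rules)


# An administrator reacting to a new wave of the scam letter:
add_rule("Nigerian", "bank")
```

Requiring every term to match, rather than any single one, is a design choice meant to keep an ordinary message that merely mentions a bank from being blocked.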
The checksum antispam approach adds the ASCII values of all the characters in a message. Two unrelated messages are unlikely to produce the same total, so e-mails with the same total are likely to be copies of the same message, and many identical copies arriving in bulk are a key spam indicator. The response by spammers is to insert random character strings in the spam, ensuring that the totals differ. Ken Schneider, Brightmail's CTO, said, "Antispam forces have begun designing programs that detect total ASCII values that are nearly the same for further evaluation."
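As a rough sketch of both the checksum and the "nearly the same" comparison, assume that closeness is judged by a relative tolerance. The 5 percent figure and the function names are hypothetical, chosen only to make the example concrete.

```python
def ascii_checksum(message: str) -> int:
    """Sum the character codes of every character in the message."""
    return sum(ord(ch) for ch in message)


def nearly_same(a: str, b: str, tolerance: float = 0.05) -> bool:
    """Assumption: 'nearly the same' means totals within a relative tolerance."""
    sum_a, sum_b = ascii_checksum(a), ascii_checksum(b)
    return abs(sum_a - sum_b) <= tolerance * max(sum_a, sum_b)


original = "Buy cheap watches today. " * 4
padded = original + "xq7"  # random characters appended by the spammer
```

The padded copy no longer has the same total as the original, so an exact-match check misses it, but `nearly_same(original, padded)` still pairs the two for further evaluation. A plain sum ignores character order, so it is a weak fingerprint on its own.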
Another example of the subtle war between spammers and antispammers, which also resembles the world of hacking, cracking, and viruses, involves cloaking. Spammers can evade word search programs by replacing certain letters with numeric codes for their ASCII values. The letter ultimately is displayed correctly on recipients’ screens but isn’t detected as it passes through the filters. The presence of cloaking in the mail reaching your network is a sign that you have a spam problem.
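One way a filter might counter cloaking is to decode the message before searching it. The sketch below assumes the cloaking takes the form of HTML numeric character references such as `&#111;` for the letter "o". The text above says only that letters are replaced with their ASCII values, so that form is an assumption, and the helper names are hypothetical.

```python
import html


def contains_keyword(body: str, keyword: str) -> bool:
    """Naive case-insensitive keyword search on the raw message body."""
    return keyword.lower() in body.lower()


def contains_keyword_decoded(body: str, keyword: str) -> bool:
    """Decode HTML character references first, then run the same search."""
    return contains_keyword(html.unescape(body), keyword)


cloaked = "Cheap m&#111;rtgage rates"  # displays as "Cheap mortgage rates"
```

The raw search misses "mortgage" in the cloaked text, while the decoded search finds it, because `html.unescape` turns `&#111;` back into "o" before matching.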
In addition to determining what must be stopped, you must understand what should be let through. For instance, almost all industries are awash in e-mail newsletters, and many are legitimate and welcomed by corporate workers. Many spam filtering programs, however, will flag these newsletters as spam if the filters are deployed poorly.
So stopping spam is only half the goal. The other half is doing so with a minimal rate of false positives, that is, legitimate e-mails that are taken for spam. "The absolute key…is if the antispam product is going to be accurate in terms of false positives," said Schneider. "If it’s the last day of the quarter and a sales contract ends up in a spam folder or deleted, heads are going to roll because the antispam solution damaged business." Ferris’ Nelson counsels managers to train employees to report the false positives they discover as a way to fine-tune the antispam software in place.
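To make the two halves of the goal concrete, the small sketch below computes a false positive rate and a spam catch rate from message counts. The function names are hypothetical, and the counts in the usage note are invented for illustration rather than measured figures.

```python
def false_positive_rate(legit_flagged: int, legit_total: int) -> float:
    """Share of legitimate e-mails that the filter wrongly marked as spam."""
    if legit_total <= 0:
        raise ValueError("legit_total must be positive")
    return legit_flagged / legit_total


def catch_rate(spam_flagged: int, spam_total: int) -> float:
    """Share of spam messages that the filter correctly marked as spam."""
    if spam_total <= 0:
        raise ValueError("spam_total must be positive")
    return spam_flagged / spam_total
```

For example, if employees report that 2 of 1,000 legitimate messages landed in the spam folder, `false_positive_rate(2, 1000)` gives a rate of 0.2 percent. Tracking that figure alongside the catch rate shows whether tuning the filter is damaging legitimate mail.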
CIOs need to be proactive
CIOs and their staff must be savvy in how they approach the overall problem and the specific remedies that are available. Perhaps the best way to turn the tide against spam is to support laws against it. Virginia, home to e-mail kings America Online and UUNet, has made a good start by passing a tough new antispam statute. It remains to be seen whether the law will be challenged and, if it stands, what effect it will have.