The truth about email spam

Spam is unsolicited bulk emailing, generally for commercial advertising purposes. It is the electronic equivalent of the junk mail you get from credit card companies like Capital One in your mailbox all the time. Email spam started to become a serious problem in the mid-1990s with the growth of the market for Internet-connected personal computers with integrated email capabilities.

Spam, or UBE (unsolicited bulk email), makes up roughly 85% of the total email volume delivered over the Internet today. This is rightly regarded as a significant problem. A number of partial solutions exist, mostly involving heuristic and blacklist-based filters, and they are nowhere near perfect. Besides failing to catch all spam, they also generate false positives: emails identified as spam, or email sources blocked as spam sources, that are in fact legitimate. For most purposes, false positives are an even worse problem than little or no spam filtering at all. Without a reasonable guarantee that legitimate emails will get through, email is useless, no matter how clean of spam it may be.
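To make the trade-off concrete, here is a minimal arithmetic sketch. Only the 85% spam share comes from the figure above; the catch rate and false positive rate are hypothetical numbers chosen for illustration, and the function name is made up for this example.

```python
def filter_outcomes(total, spam_share, catch_rate, false_positive_rate):
    """Estimate what a spam filter lets through and what it wrongly blocks.

    catch_rate and false_positive_rate are hypothetical inputs, not
    measurements of any real filter.
    """
    spam = total * spam_share
    legitimate = total - spam
    return {
        # Spam that slips past the filter into the inbox.
        "spam_delivered": spam * (1 - catch_rate),
        # Legitimate mail that the filter wrongly discards.
        "legitimate_lost": legitimate * false_positive_rate,
    }


# 1000 messages, 85% spam, a filter assumed to catch 99% of spam
# while misclassifying 1% of legitimate mail.
outcome = filter_outcomes(1000, 0.85, 0.99, 0.01)
print(outcome)
```

Under those assumed rates, roughly eight or nine spam messages still arrive per thousand, which is an annoyance, while one or two legitimate messages silently vanish, which is the failure that makes email untrustworthy.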

Dire portents:

Some have taken this to mean that email is a dead medium. In his recent article in the Programming and Development weblog here at TechRepublic, Justin James laid out his case for the obsolescence of email in a series of bullet points. Let's examine these arguments:

  • SMTP does not enforce any assurances that the server sending e-mail is authorized to act on behalf of the domain that the sender is from. There are add-ons for this like SenderID and the use of SPF tags in DNS, but they are hardly universal.

The fact that SMTP doesn't provide that authentication process is hardly a deal-breaker, though it is arguably a weakness in the protocol specification. As Justin points out, solutions to this problem have already been implemented, and the fact that none of them is universal may simply reflect the immaturity of the technique. Providing such server authentication should become a priority for mail server administrators everywhere.

Claiming that the lack of universal deployment of such solutions is reason to regard email as a dead medium is equivalent to claiming that the lack of universal implementation of either Blu-ray or HD-DVD as a high definition optical video storage format means we should never adopt high definition optical video storage media at all.
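For readers who have not seen SPF in action, the following is a rough sketch of the kind of check a receiving mail server can perform. It is deliberately simplified: it takes the record as a plain string instead of querying DNS, and it understands only the ip4, ip6 and all mechanisms, whereas real SPF defines several more (include, a, mx and others). The function name and the sample record are assumptions made up for illustration.

```python
import ipaddress

# SPF qualifiers and the result each one produces when its mechanism matches.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record, sender_ip):
    """Evaluate a simplified SPF record against a connecting IP address."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        # An optional leading qualifier; the default is "+" (pass).
        result = "pass"
        if term[0] in QUALIFIERS:
            result = QUALIFIERS[term[0]]
            term = term[1:]
        lowered = term.lower()
        if lowered == "all":
            return result  # "all" matches every sender
        if lowered.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == network.version and ip in network:
                return result
        # Mechanisms this sketch does not support are skipped.
    return "neutral"  # nothing matched


# Hypothetical record: only 192.0.2.0/24 may send mail for the domain.
record = "v=spf1 ip4:192.0.2.0/24 -all"
print(check_spf(record, "192.0.2.10"))    # address inside the listed range
print(check_spf(record, "198.51.100.7"))  # address outside the listed range
```

The point is that the check lives in DNS and in the receiving server's policy, layered on top of SMTP without changing the protocol itself.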

  • There is no requirement that encryption of the SMTP connection must always be available. Result: E-mail is open to snooping, which increases the cost and hassle of using add-on e-mail encryption products.

The implication here is that a specific encryption methodology should be built into the protocol. Protocols, however, should not mandate specific encryption methods. Doing so would build obsolescence into the protocol, and the point of a protocol is to provide a standardized, reliable means of interoperability between nodes in a complex system. As security methods become obsolete, anything inextricably tied to them becomes obsolete as well. Keeping encryption methods separate from the protocols they protect, so that they can be swapped out in modular fashion as needed, is key to the continued relevance of a given protocol's security model.
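SMTP's STARTTLS extension is one example of this modular approach: the server advertises the capability in its reply to EHLO, and the client decides whether to upgrade the connection. The sketch below parses such a reply and checks for the keyword. The function names and the sample server reply are invented for illustration, and no network connection is made.

```python
def parse_ehlo_extensions(response):
    """Map each extension keyword in an EHLO reply to its parameters."""
    extensions = {}
    lines = response.strip().splitlines()
    for line in lines[1:]:  # the first line is the server's greeting
        # Each remaining line looks like "250-KEYWORD params"
        # or, for the last one, "250 KEYWORD params".
        if len(line) < 4 or not line.startswith("250"):
            continue
        parts = line[4:].split()
        if parts:
            extensions[parts[0].upper()] = parts[1:]
    return extensions


def can_upgrade_to_tls(response):
    """Return True if the server advertised STARTTLS in its EHLO reply."""
    return "STARTTLS" in parse_ehlo_extensions(response)


# A made-up EHLO reply from a hypothetical server.
sample_reply = (
    "250-mail.example.com Hello client.example.org\r\n"
    "250-SIZE 35882577\r\n"
    "250-8BITMIME\r\n"
    "250-STARTTLS\r\n"
    "250 HELP\r\n"
)
print(can_upgrade_to_tls(sample_reply))
```

Python's standard smtplib module exposes the same idea through SMTP.has_extn and SMTP.starttls. Because the encryption is negotiated as an extension, the underlying TLS version and ciphers can change over time without any change to SMTP itself.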

  • SMTP pushes the full message across to the destination -- as opposed to a system like RSS (or most NNTP readers) where it only pushes a notice or message header across -- to be picked up at a later time. Result: Wasted bandwidth.

The reason for this is that email is intended as a message-sending medium, not a publishing medium. With a system like RSS, there are at least two separate points of failure, each of which can cause a failure of service on its own: the notification may fail to arrive, or the full content may not be reliably available when the would-be recipient tries to retrieve it. With email, the only opportunity for service failure is in the initial delivery.

The bandwidth savings for widespread electronic publishing of data are sufficiently beneficial to justify that approach in the contexts for which it is designed. The initial notification, however, is functionally equivalent to an email in itself. This means that breaking email up into two stages like RSS is equivalent to continuing to use email as it is, but requiring an additional step on the part of the recipient that is potentially at least as prone to failure as the first step. For personal communications, this trade-off between bandwidth and reliability is not a good one.
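The reliability cost of the extra step can be put in rough numbers. The sketch below assumes the two stages fail independently, which is a simplification, and the 99% success probabilities are purely illustrative figures, not measurements.

```python
def one_stage_success(p_delivery):
    """Push model: the message arrives if the single delivery succeeds."""
    return p_delivery


def two_stage_success(p_notification, p_retrieval):
    """Notify-then-fetch model: both steps must succeed.

    Assumes the two steps fail independently of each other.
    """
    return p_notification * p_retrieval


# Illustrative probabilities only: each individual step works 99% of the time.
print(one_stage_success(0.99))
print(two_stage_success(0.99, 0.99))
```

With those assumed figures, the two-stage model loses roughly twice as many messages as the one-stage model, which is the price paid for the bandwidth savings.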

  • SMTP doesn't have any way of controlling or even monitoring the progress of the mail sent. Result: E-mail is not nearly as useful for business purposes as it should be.

Frankly, I'm not sure what Justin expects here.

  • SMTP does not offer any authentication, verification, or proof of identity.

This is, in effect, another perspective on the first bullet-point. It should not have been presented separately.

The source of the problem:

Ultimately, the problem of email spam is directly related to the reasons it is so cost-effective as a means of advertisement. Spam email would not be nearly as beneficial to spammers if they had to send their own email. It is cost-effective because the spammers aren't really doing the spamming.

Instead, massive spam botnets are doing all the hard work. Trojans and other infections are spread to millions of personal computers that are then tied together in a loose, distributed spambot network which can be used to send emails in bulk, in numbers that boggle the mind. The ease of infecting the systems used in these botnets is the real problem.

Authentication of the sender would reduce the ability of spammers to pretend to be a sender other than the infected system. Encryption would have no effect on spam at all. If anything, a savings in bandwidth would make spam easier and more common.

Eliminating the means by which spammers shift their costs onto millions of unsuspecting home users of personal computers, however, would have a significant effect on the volume of spam.

The solution:

Replacing SMTP as a communication protocol will not eliminate spam. Even if spam over internet telephony is not likely to be a particular danger for a few years yet, that doesn't mean that a widespread replacement for email will be immune to the sort of spam problems that are such a major issue for SMTP.

Changing communication media will, if anything, probably only increase the ability of spammers to leverage those media for bulk commercial solicitation. This is because:

  1. For such a new communication protocol to catch on, it must at worst be no less convenient and economical than SMTP. That convenience and economy are part of what makes possible the mass quantities of spam we see weighing down the Internet.
  2. A new protocol to replace SMTP would not stop spammers from shifting the resource costs of spam onto the common home computer user, because it does not address the vulnerabilities that allow those computers to be recruited into spam botnets.

As such, the solution to spam is not replacing SMTP -- it is dealing with the epidemic of near-zero attention to security on home PCs.