Maybe your random CAPTCHA string generator should be less random

CAPTCHAs can help protect your websites from spammers, but if you aren't careful, they can offend your customers too.

One of the ways websites often defend against the predations of spammers and the like is by way of pseudo-Turing tests. They present some kind of test intended to separate the spam scripts from the humans — what some would call a "reverse Turing test" because it involves a computer testing a human, rather than a human testing a computer. By far the most common type of such a test on the Web is a challenge-response test called a CAPTCHA, or "Completely Automated Public Turing test to tell Computers and Humans Apart".

The common CAPTCHA works by presenting the user with a string of letters (and sometimes numbers or even other symbols), rendered as a distorted image to defeat character recognition software. Human visitors are presumably capable of recognizing letters even when they are visually distorted, so they can type the letters from the image into a form field and submit them. Spammer scripts, on the other hand, should fail. Striking the right balance between "too easy" and "too hard" can be difficult, though, which is why many CAPTCHA implementations come with a reload button that presents a new CAPTCHA image in case the human cannot read the distorted characters in the first one.
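As a rough illustration of the challenge-response half of that process, here is a minimal Python sketch. It leaves out the image distortion entirely, and the names `ALPHABET`, `new_challenge`, and `check_response` are made up for this example rather than taken from any particular CAPTCHA library. It also assumes the site wants case-insensitive matching and prefers to skip look-alike characters such as O and 0.

```python
import hmac
import secrets
import string

# Assumed alphabet: uppercase letters and digits, minus look-alike characters.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)


def new_challenge(length: int = 6) -> str:
    """Pick a fresh random string to render as the CAPTCHA image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected: str, typed: str) -> bool:
    """Compare what the visitor typed against the expected string.

    Surrounding whitespace and letter case are ignored (an assumption
    about what the site wants, not a rule of CAPTCHAs in general).
    """
    return hmac.compare_digest(
        expected.upper().encode(), typed.strip().upper().encode()
    )
```

A reload button would simply call `new_challenge()` again and render a new image from the result.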

To make the CAPTCHA harder for computer programs to pass, it of course makes sense not to use the same string of characters every time. For this reason, CAPTCHA images are often generated on the fly from randomly selected strings of characters or from "words" selected at random from a database. As discovered by a reader of The Register, this can lead to amusing results from time to time. The title of the article in The Register that reported such an event says it all: "Google requires user to enter 'minge'". A bit facetiously, The Register complains about this:

What, our correspondent asks, if some cybergranny were confronted by an unexpected sight of minge while logging onto Gmail over a nice cup of cocoa? And the thought of innocent children being exposed to this filth, well, it just makes our blood boil.

Many American readers may not be familiar with the term presented in this CAPTCHA SNAFU. Suffice it to say that it is a slang term for a body part not possessed by the males of the species.

Depending on your audience, it may not be in your best interests to allow your website to present visitors with CAPTCHA terms that might be considered a little "off-color", any more than Research In Motion should have used "rim" as the domain name when establishing its (mercifully short-lived, apparently) jobs portal. Make sure you tidy up your randomness algorithms, or check your database of terms to ensure it does not contain anything too terribly offensive, before deploying.
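One way to do that tidying is sketched below in Python, under some loudly stated assumptions: the blocklist here is a two-entry placeholder built from the examples in this article, and the function names (`is_clean`, `random_clean_string`, `audit_word_list`) are hypothetical, not part of any real CAPTCHA package. The idea is simply to reject any random string that contains a blocked term and try again, and to run the same check over a word database before deploying it.

```python
import secrets

# Placeholder blocklist for illustration only; a real one would be much
# longer and reviewed by an actual human being.
BLOCKLIST = {"minge", "rim"}

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def is_clean(candidate: str, blocklist=BLOCKLIST) -> bool:
    """Return True if no blocked term appears anywhere in the candidate."""
    lowered = candidate.lower()
    return not any(term in lowered for term in blocklist)


def random_clean_string(length: int = 5, max_tries: int = 100) -> str:
    """Generate random strings until one passes the blocklist check."""
    for _ in range(max_tries):
        candidate = "".join(secrets.choice(LETTERS) for _ in range(length))
        if is_clean(candidate):
            return candidate
    raise RuntimeError("could not generate an acceptable string")


def audit_word_list(words, blocklist=BLOCKLIST):
    """List the entries of a word database that contain a blocked term."""
    return [word for word in words if not is_clean(word, blocklist)]
```

Substring matching is deliberately heavy-handed: `audit_word_list(["table", "primer"])` flags "primer" because it contains "rim". For throwaway CAPTCHA strings that kind of false positive costs nothing, since the generator just picks another string. Some implementations sidestep the problem differently, by leaving vowels out of the alphabet so that few pronounceable words can form at all.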

Also, hire someone under the age of fifty from time to time, if only to see if they laugh when you present them with your brilliant idea for a product name.


Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.
