
Maybe your random CAPTCHA string generator should be less random

CAPTCHAs can help protect your websites from spammers, but if you aren't careful, they can offend your customers too.

One of the ways websites often defend against the predations of spammers and the like is by way of pseudo-Turing tests. They present some kind of test intended to separate the spam scripts from the humans -- what some would call a "reverse Turing test" because it involves a computer testing a human, rather than a human testing a computer. By far the most common type of such a test on the Web is a challenge-response test called a CAPTCHA, or "Completely Automated Public Turing test to tell Computers and Humans Apart".

The common CAPTCHA works by presenting the user with a string of letters (and sometimes numbers or even other symbols), rendered as a distorted image to defeat character recognition software. The humans visiting the site are presumably capable of recognizing letters even when they are visually distorted, so they can type the letters from the image into a form field and submit them. Spammer scripts, on the other hand, should fail. Striking the right balance between "too easy" and "too hard" can be difficult, though, which is why many CAPTCHA implementations come with a reload button that will present a new CAPTCHA image, in case the human cannot read the distorted characters in the first image.
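The challenge-response flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular CAPTCHA product works: the `CaptchaStore` class, its method names, and the in-memory dictionary are all hypothetical, and rendering the text as a distorted image is left out entirely.

```python
import hmac
import secrets
import string

# Hypothetical alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


class CaptchaStore:
    """In-memory challenge store (an assumption; a real site would
    more likely keep the expected text in server-side session data)."""

    def __init__(self, length=6):
        self.length = length
        self._pending = {}  # challenge id -> expected text

    def issue(self):
        """Create a new challenge; returns (challenge_id, text_to_render)."""
        challenge_id = secrets.token_urlsafe(16)
        text = "".join(secrets.choice(ALPHABET) for _ in range(self.length))
        self._pending[challenge_id] = text
        return challenge_id, text

    def reload(self, challenge_id):
        """Discard an unreadable challenge and issue a fresh one."""
        self._pending.pop(challenge_id, None)
        return self.issue()

    def verify(self, challenge_id, response):
        """Check a response; each challenge can be answered only once."""
        expected = self._pending.pop(challenge_id, None)
        if expected is None:
            return False
        # Ignore case and surrounding whitespace in what the user typed.
        typed = response.strip().upper()
        return hmac.compare_digest(expected.encode(), typed.encode())
```

A page would call `issue()` when it renders the form, `reload()` when the visitor clicks the reload button, and `verify()` when the form is submitted. Removing the challenge on the first attempt means a script cannot keep guessing against the same image.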

To make the CAPTCHA harder for computer programs to pass, it of course makes sense not to use the same string of characters every time. For this reason, CAPTCHA images are often generated on the fly from randomly selected strings of characters or from "words" selected at random from a database. As a reader of The Register discovered, this can lead to amusing results from time to time. The title of the article in The Register that reported such an event says it all: "Google requires user to enter 'minge'". A bit facetiously, The Register complains about this:

What, our correspondent asks, if some cybergranny were confronted by an unexpected sight of minge while logging onto Gmail over a nice cup of cocoa? And the thought of innocent children being exposed to this filth, well, it just makes our blood boil.

Many American readers may not be familiar with the term presented in this CAPTCHA SNAFU. Suffice it to say it is a slang term for a body part not possessed by the males of the species.

Depending on your audience, it may not be in your best interests to allow your website to present visitors with CAPTCHA terms that might be considered a little "off-color", any more than Research In Motion should have used "rim" as the domain name when establishing its jobs portal -- mercifully short-lived, apparently -- called RIM.jobs. Make sure you tidy up your randomness algorithms, or check your database of terms to ensure it does not contain anything too terribly offensive, before deploying.
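As a rough illustration of that advice, here is a minimal Python sketch of both approaches: tidying up the random generator and checking a list of terms before deployment. Everything in it is an assumption made for the example, including the function names, the vowel-free alphabet, and the one-entry blocklist; a real blocklist would be much longer and tailored to your audience.

```python
import secrets

# Consonants and digits only: with no vowels (and no Y), random strings
# cannot spell ordinary English words, though abbreviations are still
# possible. Look-alike characters (0/O, 1/I/L) are left out for legibility.
SAFE_ALPHABET = "BCDFGHJKMNPQRSTVWXZ23456789"

# Hypothetical blocklist for illustration only.
BLOCKLIST = {"minge"}


def contains_blocked(text, blocklist=BLOCKLIST):
    """True if any blocked term appears anywhere in text, ignoring case."""
    lowered = text.lower()
    return any(term in lowered for term in blocklist)


def random_captcha(length=6, alphabet=SAFE_ALPHABET,
                   blocklist=BLOCKLIST, max_tries=100):
    """Draw random strings until one passes the blocklist check."""
    for _ in range(max_tries):
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if not contains_blocked(candidate, blocklist):
            return candidate
    raise RuntimeError("no acceptable CAPTCHA string found")


def clean_word_list(words, blocklist=BLOCKLIST):
    """Drop database entries that contain a blocked term."""
    return [w for w in words if not contains_blocked(w, blocklist)]
```

For example, `clean_word_list(["table", "Minge", "orange"])` keeps only `"table"` and `"orange"`. Substring matching is deliberately strict, so it will also reject innocent words that happen to contain a blocked term; whether that trade-off is acceptable depends on how large your pool of terms is.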

Also, hire someone under the age of fifty from time to time, if only to see if they laugh when you present them with your brilliant idea for a product name.

About

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.

12 comments
SmartAceW0LF

is to choose the icon of the wheelchair which is allegedly there to provide an alternative way of authenticating on these sites. Be warned, if you do choose this option and successfully authenticate via this means, you are either 1.) Extraterrestrial or 2.) Extraordinarily brilliant in cognitive aural abilities. 3.) If either of the first 2 holds true for your efforts, you are definitely in the wrong job! And conversely, 4.) If you are left with your jaw open on the table wondering if there is anyone in the world that can truly understand what you just heard, then you have an intimate understanding of why I even bothered posting this.

jacobus57

Anyone who is a loyal South Park fan certainly knows what "minge" means, which is just one more reason to watch this brilliant cultural satire ;-) Second, Chrome has TWICE bungled passwords on TWO different Gmail accounts, and was also unable to verify the Captcha entries. Has anyone else had this issue?

ancientprogrammer

At least when it comes to automated spam. Automated spam programs like XRumer and such use CAPTCHA breakers that are quite effective. However, they cannot deal with scripted components that well, such as those that time user entry and interaction with the page. This is more seamless to users, nothing extra for them to enter or to struggle with reading, and generally more effective in stopping automated spam programs.

douglas.gernat

Didn't know that terminology! Could be an interesting contest for funniest CAPTCHA images. I know I've come across some interesting ones!

cperry

Personally I think that tests would work better than just random letters or words. The most basic type is a common "what is 2+2?" but that can be easily programmed. But things like "Which picture below is the White House?" or "How many sheep are in the following picture?" work fairly well with a large enough database, with the question being in a CAPTCHA-type image format rather than plain text, albeit much more readable than many CAPTCHAs. This would be very strong as it requires the bot to read the question using recognition software, then interpret the question and then answer it by somehow reading a picture (using facial recognition or something like that... not sure since each question would be different). This would also be much easier for humans to do since it's just a multiple choice click or a number or short answer. Typing CAPTCHAs on my phone with the on-screen keyboard auto-correct is quite frustrating sometimes.

apotheon

> CAPTCHA isn't the best solution anyway

This is probably true. In fact, I touch on this idea in an article I wrote that is scheduled to be published in a week or so, as kind of a follow-up to this article. Keep an eye out for it.

apotheon

It's difficult to provide accessibility support for people with disabilities using a scheme like that.

douglas.gernat

Quite a good concept! Somebody call the patent office!

bboyd

I entered my password and now you want me to read a distorted odd font of random characters with missing chunks. I've typed it in but missed the capitalization of the lower case f with the missing serif. Please remind people I want to kick them in the . Oh wait I'm supposed to refrain from speech that may offend others. My favorite is currently two pictures each hyper-linked, sets of images with computers on one side, people the other. Asks me which one I am. Too bad it doesn't handle the sight impaired well.

apotheon

No, you aren't the only person who gets them wrong. Lots of people get them wrong regularly. It depends on the CAPTCHA implementation, though. Some are much more difficult to read than others. The bad implementation at blogger.com, coupled with the fact that it requires the visitor to accept cookies from the site to validate the CAPTCHA response, is the reason I don't comment on anyone's Weblog entries there.
