on Feb 8, 2007
I mentioned before that I always follow links to people who link to me. Last week I visited one such site, and a lone captcha appeared in the middle of the page, with nothing else. “This is strange,” I thought. “This idiot has their entire site behind a captcha? That can’t be good for traffic.”
Now I was really curious as to what sort of freak would do such a thing. I entered the captcha, but somehow messed it up. Another one appeared. This is why I hate these things, dangit! I tried again. I managed to mess it up again. I tried a third time, and when it failed I suddenly realized I’d been had.
Every captcha used a different scheme. This should have clued me in that something was wrong. The page didn’t have any other text, which also should have clued me in.
I’m sure I was entering them properly. What I believe I was seeing was part of a spamming mechanism. First, the comment spam program runs into a captcha. It then lifts the image and presents it to a human like me, and waits for me to tell it what the captcha says. I’m sure every time I entered a valid captcha I was causing a spam to appear somewhere, for someone. Once it finds a willing dupe like me, it will keep showing them captchas until they get bored or give up.
(It might not have been a weblog comment spam program. It might have been making lots of user accounts on a forum, which would then be used for spamming.)
Evil, but I give them full points for creativity. They can’t hope for much traffic like that, but they weren’t really trying that hard. If they really wanted to defeat captcha in volume, they could harvest some boobie pics from around the net, make users enter captchas to move from one image to the next, and then post the results to FARK. That would give them all the captchas they could ever need.
I hope the people who design captchas learn from this: The current generation of captcha-creation is overkill. Purple text on a red background with blue dots over it with the characters “1lIjt”, all rotated at different angles and overlapping one another, and then run through a wobbling distortion filter? Are you serious? Sure, nobody would ever dream of writing software to defeat that, but half the human beings in the world can’t be expected to get it right on the first try, either.
I doubt the spammers are even trying to keep up with them. Honestly, I’m sure just rotating the letters 45° is more than enough protection to defeat OCR attempts. There would be no reason for spammers to struggle with the OCR, when they could just use the trick I outlined above.
Shamus Young is an old-school OpenGL programmer, author, and composer. He runs this site and if anything is broken you should probably blame him.