The evil spammers just keep on hitting our little blogging platform. I apologize to those of you who have had nasty spam comments published on your blogs. The spammers are truly the filth of the internet, and we’re locked in an arms race against them.

For the most part, they use automated scripts to publish comments containing links to their various casino/Viagra/porn websites, trying to boost traffic to their sites and trick Google into listing them higher in search results.

I’ve been trying different antispam techniques – using the distributed Akismet spam system, the standalone WP-SpamFree system, and a few others, but nothing has been 100% effective at keeping the spamroaches at bay.

So now I’m rolling out a “captcha” system on a trial basis. Only anonymous (i.e., not logged in) people should see it at all. When it is displayed, the person (or spam script) will have to read two words shown as distorted images and type them in as text before a comment will be accepted. For example:

[Image: sample reCAPTCHA challenge]

In this case, a person wanting to publish a comment would have to enter the words “hearted Russians” before the comment would be accepted. If a person can’t read the words, they can click the arrow link to get a new set of images to decode. If a person just can’t read captchas at all, there is an audio alternative, which is available by clicking the speaker icon.

The reCAPTCHA project is interesting. From the reCAPTCHA website:

“reCAPTCHA is a free CAPTCHA service that helps to digitize books, newspapers and old time radio shows. reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly. But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.”
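
To make the two-word trick a bit more concrete, here is a rough Python sketch of the idea as the quote describes it. This is not reCAPTCHA’s actual code: the function names, the word ID, and the voting thresholds (min_votes, min_share) are all made up for illustration, and I’m only guessing that “hearted” would be the known word in the example above.

```python
from collections import Counter, defaultdict


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding whitespace."""
    return text.strip().lower()


def check_challenge(known_answer, control_guess, unknown_id, unknown_guess, votes):
    """Accept the challenge only if the known (control) word was typed correctly.

    When it was, the guess for the unknown word is recorded as one vote.
    """
    if normalize(control_guess) != normalize(known_answer):
        return False
    votes[unknown_id][normalize(unknown_guess)] += 1
    return True


def consensus(votes, unknown_id, min_votes=3, min_share=0.75):
    """Return the agreed reading of an unknown word, or None if undecided.

    min_votes and min_share are hypothetical thresholds, not reCAPTCHA's.
    """
    counts = votes[unknown_id]
    total = sum(counts.values())
    if total < min_votes:
        return None
    reading, top = counts.most_common(1)[0]
    return reading if top / total >= min_share else None


# Votes per unknown word: word ID -> Counter of normalized guesses.
votes = defaultdict(Counter)

# Assumption: "hearted" is the known word, "scan-17" is the word OCR could not read.
check_challenge("hearted", "hearted", "scan-17", "Russians", votes)
```

The point of the design is that the person typing never knows which of the two words is the test, so the only way to pass reliably is to read both.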

I’m philosophically opposed to captchas – they penalize everyone in an attempt to stop the spammers. I’m hoping that this plugin will strike a proper balance – showing it only to people who aren’t logged in is a very good start.

Also, when creating an account on the site, people will have to decode the captcha. This is to prevent the automated “splog” scripts from creating spam blog sites to publish links – we currently get up to a dozen of these automatically created each and every day.
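
Put together, the rule for when the captcha appears boils down to something like the following. This is just a sketch in Python to spell out the policy, not the plugin’s real code; the function name and the action labels are invented for the example.

```python
def captcha_required(action, logged_in):
    """Decide whether a request must pass the captcha (illustrative rule only)."""
    if action == "signup":
        return True            # every new account has to solve it
    if action == "comment":
        return not logged_in   # only anonymous commenters see it
    return False               # nothing else is gated in this sketch
```

So a logged-in member commenting on a blog never sees the captcha, while an anonymous visitor (or a spam script) always does.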