I've noticed a steady increase in traffic to this site, mainly due to linkbacks from popular aggregators.
What was just a few dozen hits per day is now averaging several hundred. The internet truly is the world's biggest soapbox.
Of course, the extra attention comes with a downside: comment spam. At one point last week I was receiving an average of ten spam comments per day.
The typical counter-response for spam is the ubiquitous captcha:
They range from the extremely simple (the one above can be cracked via a JavaScript neural network) to damn impossible:
My personal belief is that captchas are annoying for real users and do a poor job of preventing spam, so I refuse to implement one.
Instead, I took a three-step approach to eliminating comment spam on this blog, and it has worked fairly well so far.
1. Implement a honeypot to pick off poorly written bots.
This trick involves placing a hidden text field in your form that should never have any content. Most bots fill out every input field on the form before submitting, so this picks off a large majority of them instantly.
I hid the field with absolute positioning and a negative margin rather than display: none, since that makes the trap harder for a bot writer to detect.
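Here is a minimal sketch of what the server-side half of the honeypot could look like in Python. The field name `website_url`, the markup in `HONEYPOT_HTML`, and the `-9999px` offset are all assumptions made for illustration; the code actually running on this blog is not shown in this post.

```python
from typing import Mapping

# Hypothetical field name: anything that looks tempting to a bot will do.
HONEYPOT_FIELD = "website_url"

# Assumed markup: the input is pushed off-screen with absolute positioning
# and a negative margin instead of display: none.
HONEYPOT_HTML = (
    '<input type="text" name="%s" value="" '
    'style="position: absolute; margin-left: -9999px;">' % HONEYPOT_FIELD
)


def is_honeypot_tripped(form: Mapping[str, str]) -> bool:
    """Return True when the hidden field contains any non-whitespace text."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A real visitor never sees the field, so it arrives empty and the check returns False. A bot that fills in every input trips it, and the comment can be dropped before any further checks run.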
2. Check the ratio of URLs to text in each comment.
Once a comment passes the honeypot, the next step is to analyze the ratio of URLs to content. If the comment is too full of URLs, I reject it. This could cause some false positives, but on the upside, it forces users to write longer comments.
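One way to express this check, as a rough sketch: measure what fraction of the comment's non-whitespace characters belong to URLs and reject anything over a cutoff. The regular expression, the character-based measure, and the `MAX_URL_RATIO` value of 0.3 are assumptions for the example, not the exact rule used here.

```python
import re

# Assumed pattern: only plain http:// and https:// links are counted.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)

# Assumed cutoff: reject when more than 30% of the text is URLs.
MAX_URL_RATIO = 0.3


def url_ratio(comment: str) -> float:
    """Fraction of the comment's non-whitespace characters inside URLs."""
    total = len("".join(comment.split()))
    if total == 0:
        return 0.0
    url_chars = sum(len(url) for url in URL_PATTERN.findall(comment))
    return url_chars / total


def looks_like_link_spam(comment: str) -> bool:
    """Return True when URLs make up too much of the comment."""
    return url_ratio(comment) > MAX_URL_RATIO
```

With this measure, a short comment that is mostly a link gets rejected, while the same link at the end of a few sentences passes. That is where the false positives come from, and also why it nudges people toward longer comments.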
3. Force a 60 second delay between comments.
The final step is to simply prevent multiple comment submissions in quick succession. This also prevents accidental double posts, so it is a good feature to have in general.
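A sketch of the delay, assuming comments are tracked per client by something like an IP address or session ID (this post does not say which key is used). The `CommentThrottle` class and its in-memory dictionary are hypothetical; a real site would more likely keep the timestamps in the session or the database.

```python
import time
from typing import Callable, Dict


class CommentThrottle:
    """Allow at most one comment per client every `min_interval` seconds."""

    def __init__(self, min_interval: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval = min_interval
        self._clock = clock  # injectable so the delay can be tested
        self._last_accepted: Dict[str, float] = {}

    def allow(self, client_id: str) -> bool:
        """Return True and record the time if the client may post now."""
        now = self._clock()
        last = self._last_accepted.get(client_id)
        if last is not None and now - last < self._min_interval:
            return False  # too soon after the last accepted comment
        self._last_accepted[client_id] = now
        return True
```

Only accepted comments reset the timer in this version, so a rejected double post does not push the next allowed time further out.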
I'm pleased to say that I've had this system running for several days now, and the comment spam has gone down to zero, without the pain that is a captcha.
We will see how it holds up.
Posted by Jonathan Holland on 2/15/2009.