A New Anti-Spam Measure?

Posted on November 14, 2005 7:16 PM in Blogging

I've noticed another influx of crud from spammers and would-be spammers on my site lately. I'd say I get about 2-3 comments a week that are obviously from bots trying to get some Google juice by linking to really obvious (but nonetheless obnoxious) things like Phentermine and Texas hold 'em. I've also noticed quite a few really stupid attempts at sending email via my Comments form. Good luck with that one.

Thanks to the new "Recent Comments" section in the subheader of my site, I can run a pretty tight ship around here when it comes to staying spam-free. That doesn't mean I wouldn't prefer the site to stay clean on its own, though. I'm sure I waste at least an hour a week checking for spam and cleaning it up when necessary.

Thanks to a combination of the desire to automate the anti-spam measures around here as much as possible and the slight case of OCD that I've brought up here before, I came up with a scheme that I think might just work. In fact, I think it might work for others as well, which is why I'm writing about it here.

What I've decided to implement is a sort of private handshaking scheme. In this scheme, all of the forms on my site (Comments, Contact, etc.) would include a hidden form field that looks something like the following:

<input type="hidden" name="key" value="dfe254ab34" />

The semantics are arbitrary, given that I haven't actually implemented this technique on my site yet, but the idea is that all but the value attribute would be static, and the value attribute would be generated dynamically at runtime. I've used a made-up hex code in my example to show how obscure it can be, but really the key can be anything I (or you) decide it should be.

The crux of this approach, though, is that the acceptable value of the key is always changing based on the time of day, the day of the week, the month, or whatever else I want to generate it from. There will be only one valid key for any given span of time, and only the internal code (in my case, PHP code) will be aware of what this key is.

So between 3:30pm and 4:15pm on a Friday, the only acceptable key might be dfe254ab34, but between 4:15pm and 5:00pm the key would change to something else...maybe cdb164ee8f. Because I can call the same PHP function to generate the key both when I generate the original form and when I validate a submitted form's contents, the only way anyone can spam my site programmatically is if they actually come to my site and enter the values in the form themselves. Spammers are pretty persistent, but most aren't that persistent.
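To make the idea concrete, here is a rough sketch of what the shared key function might look like. It is written in Python rather than the PHP I'd actually use, and everything in it is an assumption made for illustration: the SECRET value, the 45-minute WINDOW_SECONDS, the choice of HMAC-SHA256, and the function names. The windows here are simply counted from the Unix epoch, so they won't line up with the clock times in my example above.

```python
import hashlib
import hmac
import time

# Hypothetical values for illustration only.
SECRET = b"change-me"        # known only to the server-side code
WINDOW_SECONDS = 45 * 60     # one valid key per 45-minute span


def current_key(now=None):
    """Return the 10-hex-digit key for the time window containing `now`."""
    if now is None:
        now = time.time()
    # Every timestamp inside the same window maps to the same integer.
    window = int(now // WINDOW_SECONDS)
    digest = hmac.new(SECRET, str(window).encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:10]


def hidden_field(now=None):
    """Render the hidden form field carrying the current key."""
    return '<input type="hidden" name="key" value="%s" />' % current_key(now)


def is_valid(submitted_key, now=None):
    """Check a submitted key against the key for the current window."""
    expected = current_key(now).encode("ascii")
    return hmac.compare_digest(submitted_key.encode("utf-8"), expected)
```

The same current_key() call is used both when rendering the form and when validating a submission. Two things follow from that: a form rendered just before a window boundary stops validating as soon as the window rolls over, and anything that loads the form, human or script, is handed the current key.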

I probably haven't done the best job of explaining what I have in mind, but for some reason or another I really feel that it is going to cut down the amount of comment spam I've been seeing lately. If anyone has questions about the approach or sees any obvious holes, I'd love to hear about them. I can already tell that this approach will put a limit on the amount of time someone can spend filling out one of my forms, but if someone takes more than 45 minutes or an hour to do that, maybe I don't want to hear from them anyway ;)

I'll doubtless be posting here again once I've put my new anti-spam measure into action and have a chance to see how it performs.

I am such an idiot.


Diego Pires Plentz on November 14, 2005 at 10:01 PM:

Hahahahaha. Btw, why you use <blockquote class="code"> + <del> instead of, er, <code>?


Bernie Zimmermann on November 14, 2005 at 10:42 PM:

Diego, since I've got a little spare time on my hands after realizing I'd be wasting my time implementing my "big idea," I went in and updated my stylesheet to handle the <code> tag that you suggested. However, I used it to replace what used to be <span class="code"> because that more appropriately maps to the <code> tag's purpose. The reason I also use <blockquote class="code"> is so that I can set a large chunk of code apart from the main page content and have it appear in a similar manner to my normal <blockquote> sections.

Thanks for your suggestion. I'll try to use <code> for inline code styling from this point forward. I'm glad I have eyes like yours making sure I don't miss anything!

