More Comment Spam Countermeasures

Posted on May 29, 2007 9:15 PM in Blogging

When I consider this previous countermeasure, I think it's safe to assume that none of my remaining comment spammers actually read my blog. Working under that assumption, I think it's safe for me to post about some other countermeasures I've put in place to fight comment spam.

One of the patterns I've observed here at my blog, thanks to spam recycling, is that certain spammers like to post nothing more than a single link as the body of their comment. I think it's safe to assume that real commenters will rarely, if ever, post nothing more than a link. Working under that assumption, denying all comments that look like a single link and nothing more is as simple as writing a regular expression (not that "simple" and "regular expression" really belong in the same utterance):

// see if this is just a single link with no surrounding text
// ("#" is the pattern delimiter, so the "/" in "</a>" doesn't end the pattern early)
if (preg_match('#^<a.+?>.+?</a>$#s', $text))
{
  $smells = true;
}
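
One caveat worth sketching: because `.+?` can match across tags under the `s` modifier, a pattern of the form `^<a.+?>.+?</a>$` also accepts a body that merely starts with one link and ends with another, with ordinary text in between. Below is a minimal sketch of a stricter check, assuming the comment body arrives as a raw HTML string. The function name `is_single_link` is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper (an assumption, not from the original post):
// true only when the body is exactly one <a>...</a> element,
// ignoring leading and trailing whitespace.
function is_single_link($text)
{
    // "#" delimiters avoid having to escape the slash in "</a>".
    // [^>]* keeps the opening tag from running past its own ">".
    // (?:(?!</a>).)+ consumes the link text one character at a time but
    // refuses to step over a closing tag, so a second link cannot be swallowed.
    $pattern = '#^\s*<a\b[^>]*>(?:(?!</a>).)+</a>\s*$#is';

    return preg_match($pattern, $text) === 1;
}
```

With this version, a lone link still trips the check, while a body such as `<a href="x">one</a> hello <a href="y">two</a>` or `Nice post! <a href="x">my site</a>` does not.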

Another pattern I've noticed is that certain spammers (especially those with a penchant for spamming the bedazzlers out of this post) enjoy posting a buttload of links in a single comment. Again, I think it's safe to assume that commenters of the living, breathing variety aren't very likely to post a laundry list of links in their comments. Therefore, a more advanced regular expression should do the trick:

// see if this contains five or more links
// ("#" is the pattern delimiter, so the "/" in "</a>" doesn't end the pattern early)
if (preg_match('#<a.+?>.+?</a>.*?<a.+?>.+?</a>.*?<a.+?>.+?</a>.*?<a.+?>.+?</a>.*?<a.+?>.+?</a>#s', $text))
{
  $smells = true;
}
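
Repeating the link pattern five times works, but the threshold is baked into the expression. As an alternative sketch (not what the original code does), the links can be counted with `preg_match_all` and compared against a limit. The names `count_links` and `has_too_many_links` are hypothetical, and this version counts opening `<a>` tags rather than complete `<a>...</a>` pairs.

```php
<?php
// Hypothetical helper (an assumption, not from the original post):
// count the opening <a> tags in a comment body.
function count_links($text)
{
    // preg_match_all returns the number of full-pattern matches,
    // or false if the pattern fails to run.
    $count = preg_match_all('#<a\b[^>]*>#i', $text);

    return $count === false ? 0 : $count;
}

// Hypothetical helper: true when the body contains at least $limit links.
// The default of 5 mirrors the "five or more links" rule described above.
function has_too_many_links($text, $limit = 5)
{
    return count_links($text) >= $limit;
}
```

Changing the threshold then means changing one number instead of rewriting the expression.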

Combine these two relatively straightforward blocks of code, and you end up with a function for determining whether or not a comment smells like spam:

// function for spotting spam-ish comments
// (both patterns use "#" as the delimiter so the "/" in "</a>" is safe)
function smells_like_spam($text)
{
  $smells = false;

  // see if this is just a single link with no surrounding text
  if (preg_match('#^<a.+?>.+?</a>$#s', $text))
  {
    $smells = true;
  }

  // see if this contains five or more links
  if (preg_match('#<a.+?>.+?</a>.*?<a.+?>.+?</a>.*?<a.+?>.+?</a>.*?<a.+?>.+?</a>.*?<a.+?>.+?</a>#s', $text))
  {
    $smells = true;
  }

  return $smells;
}

I guess we'll see how it works. Well, hopefully you won't. But I will.

Comments

Zim on May 29, 2007 at 10:39 PM:

It's nice to read how you "evolve" your spam filter, the good thing is you learn about the topic and you really know the "enemy".
I'm using Akismet at my blog, and it does work pretty well.
Good luck!

free money paypal on March 08, 2017 at 1:08 PM:

So glad to know about the website for the paypal money here that can give money directly to the account.

transformice on October 09, 2017 at 12:46 AM:

Valuable info. Lucky me I found your website by accident. I bookmarked it. This article is genuinely good and I have learned lot of things from it concerning blogging. thanks.

get help file explorer windows 10 on January 11, 2018 at 10:23 PM:

Really valuable information which you shared here and feel very nice after getting this online coding. You can continue and for your readers.

Assignment Help Online on February 13, 2018 at 3:40 AM:

By reading this well-written post I consider that is valuable information Fortunately, I came to find your website. I noted this on my mark. This article is very good, I learned a lot about the blog. I am glad to read that, Thanks

putlocker on February 13, 2018 at 4:12 AM:

Magnificent site you have here, so much cool data!..

C# Programming Project Help on September 05, 2018 at 12:29 AM:

Get The Dissertation Writing Service Students Look For These Days With The Prime Focus Being Creating A Well Researched And Lively Content On Any Topic.

Help With Economics Project on September 05, 2018 at 2:05 AM:

Great Info! I Recently Came Across Your Blog And Have Been Reading Along. I Thought I Would Leave My First Comment. I Don’t Know What To Say Except That I Have

used motor grader cat 140h cca01812 for sale on October 10, 2018 at 10:50 AM:

Very good reading, I just went to a colleague to do some research.
