Mystified by Altered URIs

"As the flashbulbs burst, she holds a smile like someone would hold a crying child."
Death Cab For Cutie / Cath...

Posted on April 20, 2008 9:06 PM in Web Development

While looking through my referrer logs tonight, I decided to see which URIs most commonly cause my web server to return 404 errors. What I found has mystified me a bit.

In the list of bad URIs, I see fairly high counts for some of my most recent posts, all of which use my new post slug format:

  • /2008/04/14/high-quality-videos-on-youtube
  • /2008/04/15/call-me-a-snob-too
  • /2008/04/15/self-interest
  • /2008/04/09/take-it-off

Those are just a few examples, and every one of them is missing the trailing slash that the real post URIs end with. That seems odd to me, because I've striven to keep all of my new post slugs unambiguous, and I can't find a single place on my site where I refer to these posts without the trailing slash. So where are these referrals coming from?

Unfortunately, the version of AWStats that I'm using doesn't do a very good job of telling me where all my 404-related referrals are coming from, so I'm left wondering if there's a feed aggregator out there that is stripping off trailing slashes for the fun of it.
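
One way around that gap is to go to the raw access log directly. Here is a minimal Python sketch of the idea, assuming the server writes Apache's combined log format. The function name `summarize_404s` and the sample log file name mentioned afterward are placeholders of mine, not anything AWStats provides.

```python
import re
from collections import Counter

# Apache "combined" log format. This is an assumption about how the
# server is configured; adjust the pattern if the log format differs.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_404s(lines):
    """Count 404 responses grouped by (path, referrer, user agent)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip lines that don't parse and responses that aren't 404s.
        if match is None or match.group("status") != "404":
            continue
        key = (match.group("path"), match.group("referrer"), match.group("agent"))
        counts[key] += 1
    return counts
```

Passing an open log file (say, a hypothetical `access.log`) to `summarize_404s` and looking at `most_common(10)` on the result should show which referrers and user agents account for the slash-less requests, if any one of them stands out.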

As a stopgap measure, I've added a new RewriteRule that looks for requests for entries missing the trailing slash, adds one, and lets the offending UA know that it should regard the redirection as permanent. Unfortunately, though, this isn't a true permanent move, because the original URI was never meant to exist in the first place.
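
The logic of that rule is simple enough to sketch in Python. This illustrates the behavior only and is not the actual RewriteRule. The pattern assumes slugs of the form /YYYY/MM/DD/lowercase-words-and-digits, `redirect_for` is a hypothetical helper name, and 301 is the standard HTTP status code for a permanent redirect.

```python
import re

# Matches /YYYY/MM/DD/slug with no trailing slash. The allowed slug
# characters are an assumption based on the example URIs in this post.
SLUG_WITHOUT_SLASH = re.compile(r"/\d{4}/\d{2}/\d{2}/[a-z0-9-]+")


def redirect_for(path):
    """Return (status, location) for a slash-less post URI, else None."""
    if SLUG_WITHOUT_SLASH.fullmatch(path):
        # 301 tells the client to treat the new location as permanent.
        return 301, path + "/"
    return None
```

A request for `/2008/04/15/self-interest` would be answered with a 301 pointing at `/2008/04/15/self-interest/`, while a request that already ends in a slash, or that doesn't look like a post URI at all, falls through untouched.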

Comments

Ian Clifton on April 20, 2008 at 9:33 PM:

Sounds like you've come up with the best fix you can, for now at least. You should take a look at Google Analytics for more stats-goodness than you'll ever need. Have you looked through the server logs for referrers to the bad URIs?

Bernie Zimmermann on April 21, 2008 at 1:04 AM:

Not yet, Ian, but taking a look at my server logs is a good idea.

The only thing keeping me from Google Analytics so far is the number of times I've visited sites that seemed to load more slowly because of what appeared to be real-time analysis occurring (much like the JavaScript loading you typically run into when a site serves up a lot of Google ads).

Zim on April 21, 2008 at 9:25 AM:

I used chCounter for some time. It's easy to implement (the official site is in German, but the script is in English). You insert a code snippet into your site, and the counter tracks page hits, the sites and search engines visitors came from, and other common stats.
