How WordPress Post Slugs Work

Posted on January 10, 2008 1:39 AM in Blogging

For a while now, I've been meaning to look into how WordPress, one of the most popular blogging platforms (if not the most popular), handles post slugs. If you don't know what the heck a "slug" is, the WordPress Glossary describes it as follows:

A slug is a few words that describe a post or a page. Slugs are usually a URL friendly version of the post title (which has been automatically generated by WordPress), but a slug can be anything you like. Slugs are meant to be used with permalinks as they help describe what the content at the URL is.

Tonight, as you may have already noticed, I dove head first into the WordPress source code to find out exactly what those clever folks at Automattic (and their worldwide counterparts) are up to.

What I found is a very, very complex system for managing all the various configurable URL structures that WordPress users may choose to use on their blogs. At a very high level, however, it is fairly straightforward to break down what they are doing. So here goes...

First off, whenever you publish a new post using your WordPress dashboard, a sanitized version of your post title is stored in the posts (or wp_posts) table in your database along with the post's ID, the date of the post, your original post title, and several other post-specific pieces of information. The ID, date and sanitized version of your post's title, though, are the attributes of note in this particular exercise, so we'll focus on those.

To generate the sanitized version of your post title, WordPress sends it through a function called sanitize_title_with_dashes() that lives (at the time of writing, at least) in the /wp-includes/formatting.php file. If you take a look at this function, you'll see that the WordPress folks have taken sanitization very seriously, looking out for anything from escaped octets (which are temporarily replaced and then restored) to accents to non-standard characters and symbols. In doing so, the function converts a post title like "How WordPress Post Slugs Work" to something that looks like "how-wordpress-post-slugs-work". This is the sanitized version that ends up in the posts table (in the post_name field).
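To give a feel for what that kind of sanitizing involves, here is a rough sketch in Python rather than PHP. It is not WordPress's code: the function name slugify_title is made up for illustration, and it skips the escaped-octet handling and many of the edge cases the real function covers.

```python
import re
import unicodedata


def slugify_title(title: str) -> str:
    """Simplified, illustrative stand-in for a title sanitizer (hypothetical name)."""
    # Decompose accented characters, then drop anything that is not plain ASCII,
    # so that "é" ends up as "e".
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    lowered = ascii_only.lower()
    # Keep only letters, digits, whitespace, underscores and hyphens.
    cleaned = re.sub(r"[^a-z0-9\s_-]", "", lowered)
    # Turn runs of whitespace into single hyphens.
    hyphenated = re.sub(r"\s+", "-", cleaned.strip())
    # Squeeze repeated hyphens and trim them from both ends.
    return re.sub(r"-+", "-", hyphenated).strip("-")


if __name__ == "__main__":
    print(slugify_title("How WordPress Post Slugs Work"))
```

The order of the steps matters: accents are folded before the character filter runs, otherwise accented letters would simply be deleted instead of being mapped to their plain equivalents.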

Once your post title has been sanitized and all of the post's information has been stored in the database, the post is ready to be matched against any incoming requests to your web server whose URL includes the post slug, usually alongside some representation of the date (e.g. /2008/01/how-wordpress-post-slugs-work/). However, before WordPress-specific scripts get any chance at performing that matching, the web server must know to translate that URL into a meaningful request that hands the processing over to WordPress. That's where an .htaccess file comes in.

As David Walsh so eloquently points out, WordPress' use of the .htaccess file is pretty ingenious. Here's how he breaks it down:

We must first establish that mod_rewrite is available on the server. If so, turn on mod_rewrite. Set the base of all rewriting to the web root folder [and] if the requested filename isn't a file...and it isn't a folder…send the person to "index.php."

Once inside the index.php file, the index.php file processes the request and presents you with the page based upon the slug in the URL.

So really, the .htaccess file is ingenious in its simplicity. It's inside the index.php file where the real magic kicks in. Or rather, if you've taken a look at the actual WordPress code, you know that index.php is a meta-ish file that pulls in a whole metric ton of other, well-organized but complex code, which manages all the magical things WordPress can do from that single entry point.
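The decision the rewrite rules make can be sketched in a few lines. This is only an illustration in Python of the "not a file, not a folder, so hand it to index.php" logic described above, not how Apache or WordPress implement it; resolve_request and its parameters are hypothetical names, and the sketch deliberately ignores details like path traversal.

```python
import os


def resolve_request(doc_root: str, url_path: str) -> str:
    """Return the file or directory that should handle url_path (hypothetical helper)."""
    # Map the URL path onto the filesystem below the document root.
    candidate = os.path.join(doc_root, url_path.lstrip("/"))
    # Real files (images, stylesheets, scripts) and real directories are
    # served as they are; everything else goes to the single entry point.
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return candidate
    return os.path.join(doc_root, "index.php")
```

A request for an existing stylesheet would resolve to the stylesheet itself, while a request for /2008/01/how-wordpress-post-slugs-work/ matches nothing on disk and so falls through to index.php.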

To keep things focused, though, we'll look next at how WordPress manages to translate the requested URL into the single, correct post that can be returned to the user agent (UA) and therefore displayed to the user.

Remember how I mentioned that the ID, date and sanitized title would be important later on? Well, here's where those particular pieces of information become vital. In order to display the proper post, the code needs to be able to ascertain the post ID based on the information in the requested URL alone. Since that URL contains three important elements (the year, the month and the sanitized title), those can be used to generate a database query that will fetch the one unique post in the posts table that matches the user's request.
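As a small illustration of pulling those three elements out of a URL, here is a sketch in Python. It assumes a /YYYY/MM/post-slug/ permalink structure like the example earlier; PERMALINK_PATTERN, parse_permalink and the dictionary keys are names chosen for this sketch, and WordPress's own rewrite handling is far more general than a single regular expression.

```python
import re
from typing import Optional

# Hypothetical pattern for a /YYYY/MM/post-slug/ permalink structure.
PERMALINK_PATTERN = re.compile(
    r"^/(?P<year>\d{4})/(?P<month>\d{2})/(?P<name>[^/]+)/?$"
)


def parse_permalink(path: str) -> Optional[dict]:
    """Split a date-based permalink path into year, month and slug."""
    match = PERMALINK_PATTERN.match(path)
    if match is None:
        # Not a date-based permalink (for example a page like /about/).
        return None
    return {
        "year": int(match.group("year")),
        "monthnum": int(match.group("month")),
        "name": match.group("name"),
    }


if __name__ == "__main__":
    print(parse_permalink("/2008/01/how-wordpress-post-slugs-work/"))
```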

If you take a look at the get_posts() function in /wp-includes/query.php, you'll see something like the following:

$where .= " AND YEAR(post_date)='" . $q['year'] . "'";

This and a similar call for checking the month become part of a more detailed database query for fetching all the post information, which also includes a search for the sanitized post title during the associated time period (this allows for duplicate post titles that occur in different year/month combinations). Once the database query returns the one post ID associated with the original post, WordPress has all it needs to return the post-specific information to the UA, and then its job is done (until the next user interaction, that is).
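Here is a minimal sketch of that kind of lookup, written in Python against an in-memory SQLite database rather than WordPress's MySQL setup. The column names follow the posts table described above, but find_post_id, make_demo_db and the sample rows are invented for illustration, and SQLite's strftime() stands in for MySQL's YEAR() and MONTH().

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in for the posts table (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "ID INTEGER PRIMARY KEY, post_date TEXT, post_title TEXT, post_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts (ID, post_date, post_title, post_name) VALUES (?, ?, ?, ?)",
        [
            (1, "2007-01-05 09:00:00", "Hello World", "hello-world"),
            (2, "2008-01-10 01:39:00", "How WordPress Post Slugs Work",
             "how-wordpress-post-slugs-work"),
            (3, "2008-02-01 12:00:00", "Hello World", "hello-world"),
        ],
    )
    return conn


def find_post_id(conn: sqlite3.Connection, year: int, month: int, name: str) -> Optional[int]:
    """Look up a post ID by year, month and sanitized title."""
    # Bound parameters are used here instead of string concatenation.
    row = conn.execute(
        "SELECT ID FROM posts "
        "WHERE post_name = ? "
        "AND CAST(strftime('%Y', post_date) AS INTEGER) = ? "
        "AND CAST(strftime('%m', post_date) AS INTEGER) = ?",
        (name, year, month),
    ).fetchone()
    return row[0] if row else None
```

With the sample rows, the two "Hello World" posts share a sanitized title but sit in different months, so the year and month conditions are what tell them apart.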

So there you have it — a relatively high-level look at how WordPress handles post slugs from both the perspective of the publisher and the consumer. Please keep in mind that this is my understanding of the process based on about an hour's worth of digging through PHP source code, so it's not guaranteed to be dead on, especially for older releases of the software or versions that have yet to be released. However, as always, I invite you to leave a comment if you have any additional insight or corrections to share.

Comments

zach on March 14, 2008 at 8:27 PM:

thanks!

in /wp-includes/formatting.php

function sanitize_title_with_dashes($title)

requires:

function remove_accents($string)
function seems_utf8($Str)
function utf8_uri_encode( $utf8_string, $length = 0 )

Burak Erdem on May 14, 2008 at 10:30 AM:

I was looking for a text slug function for my own CMS, and your post lead me the way. It works perfect :)

Thank you so much for sharing your thoughts..

John on September 25, 2008 at 7:46 AM:

I didn't have the .htaccess file on my blog dir, so everytime someone clicked a link I posted, it gave a slug error. Your page helped me figure it out.

Thanks.

vlad babii on January 21, 2009 at 1:33 PM:

I made a plugin that allows extensions. Alot of people asked me about it. Go to my site / experimente / vb-slug-allow-extensions. No need to hack wordpress files :D.

Ross on July 23, 2009 at 6:49 AM:

Nice & thanks Zach exactly what i was looking for - why reinvent the wheel!

( 1 note... how come they are not being used on this site? :/ )

Ross on July 23, 2009 at 6:51 AM:

just realised when using the navigation you are using them, but i found this page through google:

http://www.bernzilla.com/item.php?id=1007

Also your recent comments have urls like this...

ahmed on October 05, 2009 at 8:15 PM:

nice work i really need this function

Marein on March 05, 2010 at 11:44 AM:

I was looking all over for a way to get WordPress to leave out the dashes in the automatically generated permalinks (so far I was taking them out manually each time), and your pointer to formatting.php and the name of the function that I was looking for, was all I needed :) Thanks!

Ahmad on May 25, 2010 at 4:11 AM:

You saved me a lot of time and effort. I was trying to know how this works and how the hell does Word Press ignore that ID part when constructing the SEO friendly URL. Thanks a lot
