August 2005

Dealing with Node-sets and XPath in .NET

August 28, 2005 7:07 PM

Anyone who has done any XML-related development in C# knows that Microsoft has done a lot to make dealing with XML a straightforward task in the relatively new language. That being said, there are certainly some weak areas that have caused many a developer (myself included) to bang their head on their keyboard a few times.

The most recent weakness I've come across is dealing with node-sets returned from XPath queries. Sure, it could be argued that XPath is most commonly used as a method for extracting granular data from an XML document, but even so there are cases when a developer wants to isolate a subset of nodes from an XML document and do things such as printing, manipulating, or passing the particular node-set on to another function.

Unfortunately, as nice as the XPathNavigator/XPathNodeIterator model is for extracting data efficiently from an XML document, it's frustratingly unintuitive for dealing with XPath queries that return node-sets rather than textual or numerical data.

Logically speaking, when you know your node iterator object is pointing at a node-set, if you want to do something as simple as printing the node-set you have matched, it seems like you should have immediate access to some variation of an OuterXml() function directly from your node iterator object. However, this is not the case in .NET. The reasoning is most likely that providing access to such a function would cause efficiency to take a major hit. That being said, it surprises me that this reasoning is not better documented. Are there really not many people out there wishing to work with node-sets as opposed to leaf-level data?

After muttering a few threats to set the building on fire under my breath, I eventually found a post in Google Groups that gave me access to knowledge that I fear too many developers facing the same problem never obtain.

After learning of the existence of the IHasXmlNode interface (again, something completely buried in the documentation) and then reading the following:

Whether or not the XPathNavigator implementation is a DocumentXPathNavigator or something else depends on what the XPathNodeIterator is iterating over in the first place – if it originally came from an XmlDocument then it will be DocumentXPathNavigator, but if you're iterating something else, then the current node implementation will be different.

I realized that accessing a node-set returned from an XPath query in .NET is almost as difficult as making it through Ishtar. Not only do you have to know about the IHasXmlNode interface, but you also have to make sure your XPathNavigator object was created from an XmlDocument rather than an XPathDocument, which is completely unintuitive.

The following is an example of the correct way to print a node-set using XPath in .NET:

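// namespaces this snippet relies on
using System;           // Console
using System.Xml;       // XmlDocument, XmlNode, IHasXmlNode
using System.Xml.XPath; // XPathNavigator, XPathNodeIterator

// load the XML document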
XmlDocument doc = new XmlDocument();
doc.Load("somefile.xml");

// get ready to navigate/iterate
XPathNavigator nav = doc.CreateNavigator();
XPathNodeIterator iter;

// retrieve a node-set via XPath
iter = nav.Select("/Some/Nodeset/Somewhere");

// make sure a match was found
if (iter.MoveNext())
{
  // get the XmlNode behind the first match; the cast only works
  // because the navigator was created from an XmlDocument
  XmlNode node = ((IHasXmlNode) iter.Current).GetNode();

  // print the node-set to the console
  Console.WriteLine(node.OuterXml);
}

Now whether it's efficient or not is another story, but hey, it works. And that, folks, is worth its weight in gold.

Programming | Post Comments | View Comments (0) | Permalink

AWStats 6.5

August 27, 2005 10:10 AM

In this day and age of web syndication, bloggers like myself are becoming just as concerned (if not more) about our news reader readership as we are about our actual site visitors (the web designer in me cares about the latter quite a bit more, of course).

I've recently been tailing my web server's access log to try to get a feel for how many people are subscribed to my various feeds and what clients they are using. Then I thought, "I wonder if I could create a Perl script for analyzing this data and creating a report every week or so?"

Only a few days later I received an email from the AWStats mailing list letting me know that the beta of version 6.5 has support for "RSS catcher/readers in robot database." Granted, they could have been a bit more specific with their description, but I could only take this to mean that when 6.5 goes stable I will be able to see syndication statistics alongside my site visitor statistics.

I guess I should have seen this coming. After all, you couldn't ask for a better analysis tool than AWStats.
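
As for the script I was pondering, here is only a minimal sketch of the idea, written in Python rather than Perl, and in no way a description of how AWStats does it. It assumes the access log is in Apache's combined format; the feed paths (/rss.xml and /atom.xml) and the function name feed_clients are hypothetical placeholders. It counts distinct client IPs per user agent for requests to feed URLs, which gives a rough feel for which readers are polling the feeds.

```python
import re
from collections import Counter

# Assumption: hypothetical feed URLs; substitute the real feed paths.
FEED_PATHS = ("/rss.xml", "/atom.xml")

# Assumption: Apache "combined" log format
# (ip ident user [time] "request" status bytes "referer" "user-agent").
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def feed_clients(lines, feed_paths=FEED_PATHS):
    """Count distinct client IPs per user agent for requests to feed URLs."""
    seen = set()        # (ip, agent) pairs already counted
    counts = Counter()  # user agent -> number of distinct IPs
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("path") not in feed_paths:
            continue  # skip unparsable lines and non-feed requests
        key = (match.group("ip"), match.group("agent"))
        if key not in seen:
            seen.add(key)
            counts[match.group("agent")] += 1
    return counts
```

Calling feed_clients(open("access.log")) (the file name is again just an example) returns a Counter keyed by user agent. Keep in mind that a web-based aggregator fetches a feed once on behalf of all of its subscribers, so numbers like these are a rough lower bound at best.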

Blogging | Post Comments | View Comments (2) | Permalink

I Love Pearl Jam

August 25, 2005 10:51 AM

Pearl Jam, the band that just keeps on giving.

Music | Post Comments | View Comments (11) | Permalink

Flirting with Ajax

August 24, 2005 10:58 PM

I wasn't kidding when I said very soon. I've made yet another change to my search page, and while it may not be all that monumental for users, it is pretty monumental for me personally. Why? Because it's my first foray into the world of Ajax.

It's a real shame how long it takes me to go from a concept in my head to actually implementing it here on my blog, but better late than never, I suppose. Now for a little explanation of what it does, how it works, etc.

If you head on over to my search page, you'll notice that it looks the same as it always did. In fact, the old functionality is still there and did not change a bit (more on this a little later). You shouldn't notice any change until you start to enter search terms into the input box. If you type in something like "Mozilla Firefox" or "Asa Dotzler," you'll notice that search results are displayed below the form on-the-fly, all via the beauty of Ajax.

In order to implement the dynamic searching I followed along with Bill Bercik's Guide to Using XMLHttpRequest. I was able to use most of his code and only tweak a few things so that it was more appropriate for searching and updating search results in a div (rather than in some form element...which isn't as aesthetic). The "Google Suggest Hack" over at Basic AJAX Examples came in handy too, simply because it showed me how to update a div on-the-fly.

One of the great things about how I implemented this new functionality is that I didn't have to lose the old functionality in the process. This means that if you completely disable JavaScript in your browser (or for some crazy reason your browser doesn't support it), you'll still be able to type in search terms and hit the "Search" button to get results. I even made sure to wrap those results in the same div that encloses the Ajax results so that they can still be updated on-the-fly in the case that a user wants to mix-and-match searching methods.

All-in-all, this was an excellent exercise for me in that it gave me real hands-on experience with Ajax. I plan on working on a web project in the near future that will be very Ajax intensive, so having this experience under my belt should help a great deal. It may not add all that much to my site overall, but I think this new functionality, combined with the full-text searching I added earlier tonight, at least accomplishes my goal of making my search page mo' betta, and mo' robust.

If you have comments, questions, or would like to see more of the technical details of how this was implemented, leave a comment or drop me an email.

Web Development | Post Comments | View Comments (2) | Permalink

Full-Text Searching in MySQL

August 24, 2005 6:45 PM

I've been meaning to upgrade my search page for quite some time now. The main reason is that my old algorithm was case-sensitive (uber annoying) and not even vaguely robust. This was due to the fact that I was simply plugging the search term(s) into a LIKE query. Typically, MySQL string comparisons are case-insensitive by default, but such is not the case when you are performing a LIKE query on a binary column such as a BLOB (a TEXT column, by contrast, is compared according to its collation, which is case-insensitive by default).

Fortunately enough, in finally getting around to updating my search page I was able to take advantage of MySQL's relatively new full-text search functions. This is a really cool new feature that has been discussed in detail elsewhere.

Before I could implement full-text searching here on my blog, though, I had to tackle one minor caveat. Way back in the day, I made the mistake of defining the data column that contains all of my full blog posts as a BLOB rather than TEXT, and because of slight differences in the ways those data types are stored by MySQL, full-text searching is only available to columns defined as the latter. Luckily, switching from BLOB to TEXT was easy as pie (and I imagine the inverse is true). The task was accomplished with the following query:

ALTER TABLE blog MODIFY text TEXT;

Now that my data types were in compliance with the requirements for full-text searching, all I had to do was make the following alteration to my table of blog entries:

ALTER TABLE blog ADD FULLTEXT(title, text);

By doing so, I had let MySQL know that I planned on performing full-text search queries that involved those two columns. Now, my search page utilizes this new functionality via a search similar to the following example:

SELECT id, title FROM blog WHERE MATCH(title, text) AGAINST('firefox');

This has made my search page much more robust because not only is it no longer sensitive to case, but it also sorts the search results by relevance automatically. Pretty cool, huh?

I realize this probably comes as old news to a lot of techheads out there, but it's always exciting to see cool technology (even if it is a bit dated) applied to something so close to your heart (or brain, maybe?).

Look for mo' betta search capabilities here very soon.

Web Development | Post Comments | View Comments (0) | Permalink

Google Talk

August 23, 2005 7:54 PM

It's a little crazy to see the flood of visitors coming to my site today after looking up Google Talk Instant Messenger. Apparently, like me, they've read the various articles around the web today and are hungry for more details.

It sounds like we'll have more than just details by tomorrow, when Google is rumored to be scheduled for a release of their new IM software. If you haven't seen it already, Download Squad has released a full review of Google Talk (via Digg), including several screenshots of the minimal yet pleasing user interface. There are even reports of people who have made it onto the Google Talk network via Jabber already.

I was a little worried that the information people were finding on my blog was really out-of-date or misleading, but it's actually quite surprising to read just how close to reality the blogosphere was almost a full year ago.

I'm looking forward to playing around with Google Talk tomorrow. I'm sure I'm not the only one.

Computers | Post Comments | View Comments (1) | Permalink

One Year of Dog-Ears

August 20, 2005 7:31 PM

A year ago when I was making some changes to the functionality of my blog, I added the By Popularity section to my blog archives. Now that a year has passed, I thought it appropriate to take a look at what has transpired and make note of some of the dog-ears that have started to appear.

First off, The Difference Between Gray and Grey gets the "most popular" title. It's been interesting to track some of the comments that visitors have made regarding that post over the course of the past year.

On the other hand, there was the infamous Gmail Problems post that got so much unwanted attention that I had to make changes to my CMS to thwart it.

GrayModern was (and still seems to be) very popular, and so were XHTML issues and the differences between numerous things.

Probably the most surprising of my most popular blog entries, though, are the post about 'back door men', which seems to get a lot of attention from the anally inclined, and my post about sixth grade camp, which gets a lot of attention from people trying to figure out how to tie a "do rag."

It's been an interesting year for sure, and I'm looking forward to seeing if these trends continue or if some of my more recent blog postings stem the tide a bit. Look for another update in about a year or so (whether or not blogging is still considered cool at that point).

Blogging | Post Comments | View Comments (0) | Permalink

I Am Red Hat Linux

August 20, 2005 9:12 AM

I decided to follow Ryan's lead and take the Which OS Are You? quiz. I wasn't sure if I was going to post the result on my blog or not, but now that I've seen what OS I am, I couldn't resist.

You are Red Hat Linux. You're tops among your peers, but still get no respect from them.  It's all right with you.  You have your sights set higher.

Computers | Post Comments | View Comments (0) | Permalink

Serving Ads Unobtrusively

August 19, 2005 11:37 PM

Design tweaks aren't the only thing going on around here lately. After seeing how unobtrusively serving ads can be beneficial to both site visitors and a site's maintainer, I've decided to start serving ads (via Google AdSense) here on my blog.

Before you choke on your roast beef sandwich, let me first explain why and then tell you why it won't affect you.

I have plans for some web projects that are looming on the horizon, and if they are as successful as I believe they can be, the extra revenue will help curb some of the bandwidth costs I may run into. Plus it's kinda cool to make a little money doing what you love, especially when you can do it without affecting the people you care about most (blogospherically speaking, that is). That leads me to my next point.

Serving up ads can be done unobtrusively, and I believe the method I've come up with is pretty creative. As several visitors are probably aware, I have been keeping track of my most popular blog entries for roughly a year now (I will get into this in detail in a later blog post). By doing so, I've come up with a good list of entries that get a lot of traffic on a daily basis. By using a similar approach to the one I've employed in my archives, I can serve ads to the most popular blog entries so that "one-hit-wonder" visitors (the kinds of visitors that come in looking for information, find it, then never return again) potentially add to my ad revenue, and return visitors very rarely (if ever) have to see an ad.

The method I've come up with for determining if a blog entry is ad-worthy or not is to figure out if the entry is in the top 25% of my most popular entries. If it is, it is deemed ad-worthy and an ad will appear when the entry is viewed individually (never when it is displayed as part of a month, such as July 2005). If it isn't, then no ad is displayed. For an example of the difference, take a look at an ad-worthy entry and an entry that isn't (keeping in mind that these are good examples at the time of writing...due to the dynamic nature of my algorithm this may change over time).
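
The selection rule described above can be sketched roughly as follows. This is an illustration in Python, not the code my CMS actually runs: the names ad_worthy_ids and should_show_ad, the view_counts mapping of entry id to view count, and the choice to always keep at least one entry are all assumptions made for the example.

```python
def ad_worthy_ids(view_counts, top_fraction=0.25):
    """Return the ids of the entries in the top `top_fraction` by view count.

    `view_counts` is a hypothetical mapping of entry id -> number of views.
    """
    if not view_counts:
        return set()
    # Most-viewed first; ties are broken by id so the result is deterministic.
    ranked = sorted(view_counts, key=lambda entry_id: (-view_counts[entry_id], entry_id))
    # Assumption: round down, but always keep at least one entry.
    cutoff = max(1, int(len(ranked) * top_fraction))
    return set(ranked[:cutoff])


def should_show_ad(entry_id, view_counts, viewed_individually):
    """Show an ad only on an individually viewed entry that is ad-worthy."""
    return bool(viewed_individually) and entry_id in ad_worthy_ids(view_counts)
```

Because the ranking is recomputed from the current view counts, an entry can drift in or out of the ad-worthy set over time, which is the dynamic behavior mentioned above.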

By displaying ads on only the top 25% of my blog entries, loyal readers (who tend to read my blog entries within a few days of their posting) never see an ad, and most casual visitors never do either. The only people impacted by the ads are, as I mentioned earlier, the one-time offenders that come here looking for solutions to their Gmail problems or the origins of Andy Milonakis' fame.

In the end, I win out too because I don't affect my loyal readers in a negative way and yet at the same time my most frequently visited blog entries are serving up ads that may lead to revenue.

As is the case with most modifications I make to my CMS, I'll be reviewing this for a while to make sure things go smoothly. On top of that, your feedback is always welcome.

Blogging | Post Comments | View Comments (1) | Permalink

Time for a Refresh

August 19, 2005 10:00 PM

If you haven't noticed already (e.g. if you're reading this in a news reader like Bloglines), the design of my blog has been tweaked just a bit. The casual viewer might not notice all that much, but those who come back on a regular basis probably will.

First off, just to make sure all the dust gets brushed off, be sure to hit your browser's refresh button to take care of any annoying caching issues.

Now that you've done that, you'll notice that the header image looks just a bit different. This is because I've expanded the content frame out 100 pixels, to provide more real estate for content (and other things which I will bring up in a later blog posting).

Also, you'll notice that all of the fonts have been increased in size. This is because I feel it makes the content easier to read (although some of the aesthetics may have been sacrificed in the process). The sidebar has received an increase in width to go along with the font size increase so that the B-Sides have some room to breathe.

The other major change actually happened under the hood, and is the only change that required me to wander outside of the stylesheet realm. I updated the pages of my site to use more appropriate heading tags so that Google will understand them better. For instance, main page titles used to be contained within h5 tags (simply because I didn't know any better when I first came up with the new design) but now are more appropriately contained within h1 tags.

I've finished my tweaks for now, and I've tested the changes in a number of browsers. I've noticed that the site looks the best in Firefox on Linux, second best in Firefox on Windows, third best in Opera on Windows, and just plain bad in Internet Explorer on Windows. My site has always looked pretty bad in Internet Explorer, but the best part is I don't really care.

I do care about your thoughts on the changes I've made, though. If you have any feedback, positive or negative, please share here in the comments, or if you're shy, shoot me an email.

Look for more site changes in the coming weeks.

Web Design | Post Comments | View Comments (2) | Permalink

The Difference Between Sympathy and Empathy

August 19, 2005 7:40 PM

I've been meaning to look up the difference between sympathy and empathy for a while now, but apparently there isn't one. Answers.com lists empathy as a synonym for sympathy and vice versa.

If you ask Google, you'll most likely come across a site that contradicts itself when trying to point out the difference:

Empathy is entering into another's feelings. Sympathy is having a feeling together with someone.

...

Sympathy is when you feel bad for someone else. Empathy is when you feel bad with someone else.

Pay attention to the word "with" in those attempted explanations and you'll see the contradiction.

After finally looking into the matter, I've come to the conclusion that there really isn't any difference. That's a good thing, though – less work for my brain to do when coming up with the right words to say.

If you have your own idea of the difference between the two words, I'd love to hear.

Blathery | Post Comments | View Comments (68) | Permalink

Cell Phone Ubiquity

August 11, 2005 7:41 PM

In an article at News.com today called Ring tones that bite and zing, the following caught my eye:

Some traditional ring tones have become so commonplace that birds in Denmark, Germany and Finland are said to be mimicking them in their songs.

That is pretty crazy.

Blathery | Post Comments | View Comments (2) | Permalink

Forced Deja Vu in Firefox

August 02, 2005 10:04 PM

Let's say I'm minding my own business and I decide I want to download a mash-up of the Muppets, The Doors, and the Macarena. Still with me? Okay, so I click on the link that Mashup of the Week Podcast has so graciously provided and wait while the file downloads. When it is done, the mp3 plays in Winamp. So far so good.

Now let's say, just for giggles, that I like what I hear and want to save the mp3 I just downloaded to my hard drive. Okay, so I just right-click on the link and pick where I want to save it, right? Right. However, once I've chosen a location, Firefox starts downloading the mp3 all over again. What's up with that?

Basically, any time I listen to an mp3 before saving it to a non-temporary location on my hard drive (since it gets saved to a temporary location in order to listen to it with Winamp) I have to re-download the file before I can save it. If I try doing the same thing in Internet Explorer, that browser is smart enough to know that the file is already cached and to just copy it to wherever I decided to save it. Firefox, in this case, doesn't seem to be as smart as IE. How unfortunate.

If anyone out there has any ideas as to why this is, I'd love to hear. There's probably something in Bugzilla about it (at least I hope there is), but I'm too lazy to go bug hunting right now. I'm too busy listening to mash-ups.

Browsers | Post Comments | View Comments (0) | Permalink

Stupid Turkish Hackers

August 02, 2005 5:33 AM

The discussion forum over at my Coldplay site was hacked yesterday. When I paid a visit, I saw "Hacked by KavaLye Turkish Hacker" all over the place. According to Google, I'm not the only one either.

Unfortunately, as is the nature of such problems, when you Google for a solution you just find more evidence of the problem. I did a little investigation of my own and found that three of my phpBB database tables had been hacked: forums, categories, and config. Luckily, because I like to learn from other people's mistakes, I had a recent database backup to reference, and was able to restore all the correct table values fairly quickly.

After getting my database back in shape, I quickly headed over to the phpBB site to get the latest version of my forum software. It turns out I was about 9 point versions behind the latest release – not anymore.

The moral of the story? Turkish hackers are nuggets.

Oh, and remember to back up your databases.

Web Development | Post Comments | View Comments (25) | Permalink