Dealing with Node-sets and XPath in .NET

"And it came to me then that every plan is a tiny prayer to Father Time."
Death Cab / What Sarah Said

Posted on August 28, 2005 7:07 PM in Programming

Anyone who has done any XML-related development in C# knows that Microsoft has done a lot to make dealing with XML a straightforward task in the relatively new language. That being said, there are certainly some weak areas that have caused many a developer (myself included) to bang their head on their keyboard a few times.

The most recent weakness I've come across is dealing with node-sets returned from XPath queries. Sure, it could be argued that XPath is most commonly used as a method for extracting granular data from an XML document, but even so there are cases when a developer wants to isolate a subset of nodes from an XML document and do things such as printing, manipulating, or passing the particular node-set on to another function.

Unfortunately, as nice as the XPathNavigator/XPathNodeIterator model is for extracting data efficiently from an XML document, it's frustratingly unintuitive for dealing with XPath queries that return node-sets rather than textual or numerical data.

Logically speaking, when you know your node iterator is pointing at a node-set and you want to do something as simple as printing it, it seems like you should have immediate access to some variation of an OuterXml property directly from the iterator. However, this is not the case in .NET. The most likely reason is that providing such a member would cause efficiency to take a major hit. That being said, it surprises me that this reasoning is not better documented. Are there really so few people out there wishing to work with node-sets as opposed to leaf-level data?
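For readers on a later framework version, here is a minimal sketch of what that direct access can look like. It assumes the OuterXml property on XPathNavigator, which was added in .NET Framework 2.0 and is not available in the 1.x releases. The class and method names (NavigatorOuterXmlSketch, MatchedMarkup) and the sample XML are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class NavigatorOuterXmlSketch
{
    // Hypothetical helper: returns the markup of every node matched by
    // the XPath expression, read straight from the navigator.
    // Assumes XPathNavigator.OuterXml exists (.NET Framework 2.0 or later).
    public static List<string> MatchedMarkup(string xml, string xpath)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNodeIterator iter = doc.CreateNavigator().Select(xpath);

        List<string> results = new List<string>();
        while (iter.MoveNext())
        {
            // iter.Current is an XPathNavigator positioned on the match
            results.Add(iter.Current.OuterXml);
        }
        return results;
    }

    public static void Main()
    {
        // made-up sample document
        string xml = "<Some><Nodeset><Item>1</Item><Item>2</Item></Nodeset></Some>";
        foreach (string markup in MatchedMarkup(xml, "/Some/Nodeset/Item"))
        {
            Console.WriteLine(markup);
        }
    }
}
```

Because the property lives on the navigator itself, this version does not care whether the navigator came from an XmlDocument or an XPathDocument.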

After muttering under my breath a few threats to set the building on fire, I eventually found a post in Google Groups that gave me access to knowledge that I fear too many developers facing the same problem never obtain.

After learning of the existence of the IHasXmlNode interface (again, something completely buried in the documentation) and then reading the following:

Whether or not the XPathNavigator implementation is a DocumentXPathNavigator or something else depends on what the XPathNodeIterator is iterating over in the first place – if it originally came from an XmlDocument then it will be DocumentXPathNavigator, but if you're iterating something else, then the current node implementation will be different.

I realized that accessing a node-set returned from an XPath query in .NET is almost as difficult as making it through Ishtar. Not only do you have to know about the IHasXmlNode interface, but you also have to make sure your XPathNavigator object was created from an XmlDocument rather than an XPathDocument, which is completely unintuitive.
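To make the XmlDocument versus XPathDocument distinction concrete, here is a rough sketch that asks each kind of navigator whether it implements IHasXmlNode instead of casting blindly. The helper name TryGetNode, the class name, and the sample XML are my own inventions. Only IHasXmlNode, GetNode(), and CreateNavigator() come from the framework.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class NodeAccessSketch
{
    // Hypothetical helper: returns the underlying XmlNode when the
    // navigator exposes one through IHasXmlNode, or null when it does not.
    public static XmlNode TryGetNode(XPathNavigator nav)
    {
        IHasXmlNode hasNode = nav as IHasXmlNode;
        return hasNode == null ? null : hasNode.GetNode();
    }

    public static void Main()
    {
        // made-up sample document
        string xml = "<Some><Nodeset><Somewhere/></Nodeset></Some>";

        // navigator backed by an XmlDocument (DOM)
        XmlDocument dom = new XmlDocument();
        dom.LoadXml(xml);
        XPathNavigator fromDom = dom.CreateNavigator();

        // navigator backed by a read-only XPathDocument
        XPathDocument cache = new XPathDocument(new StringReader(xml));
        XPathNavigator fromCache = cache.CreateNavigator();

        Console.WriteLine("XmlDocument navigator has a node:   " + (TryGetNode(fromDom) != null));
        Console.WriteLine("XPathDocument navigator has a node: " + (TryGetNode(fromCache) != null));
    }
}
```

Using `as` rather than a hard cast means an XPathDocument-backed navigator yields null instead of an InvalidCastException, which is easier to handle when you do not control where the navigator came from.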

The following is an example of the correct way to print a node-set using XPath in .NET:

using System;
using System.Xml;
using System.Xml.XPath;

// load the XML document
XmlDocument doc = new XmlDocument();
doc.Load("somefile.xml");

// get ready to navigate/iterate
XPathNavigator nav = doc.CreateNavigator();
XPathNodeIterator iter;

// retrieve a node-set via XPath
iter = nav.Select("/Some/Nodeset/Somewhere");

// make sure a match was found
if (iter.MoveNext())
{
  // get the first matched node; the cast only succeeds because
  // the navigator was created from an XmlDocument
  XmlNode node = ((IHasXmlNode) iter.Current).GetNode();

  // print the node and everything beneath it to the console
  Console.WriteLine(node.OuterXml);
}
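The snippet above only looks at the first match. If the XPath expression can match several nodes, a loop is needed. The following is one possible sketch, not the only way to do it: the class name NodeSetPrinter, the method OuterXmlOfMatches, and the sample XML are hypothetical, and it uses generics, so it assumes .NET 2.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NodeSetPrinter
{
    // Hypothetical helper: collects the OuterXml of every node matched
    // by the XPath expression in an XmlDocument.
    public static List<string> OuterXmlOfMatches(XmlDocument doc, string xpath)
    {
        List<string> results = new List<string>();
        XPathNodeIterator iter = doc.CreateNavigator().Select(xpath);

        // MoveNext() returns false once every match has been visited
        while (iter.MoveNext())
        {
            IHasXmlNode hasNode = iter.Current as IHasXmlNode;
            if (hasNode != null)
            {
                results.Add(hasNode.GetNode().OuterXml);
            }
        }
        return results;
    }

    public static void Main()
    {
        // made-up sample document
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<Some><Nodeset><Item>1</Item><Item>2</Item></Nodeset></Some>");

        foreach (string markup in OuterXmlOfMatches(doc, "/Some/Nodeset/Item"))
        {
            Console.WriteLine(markup);
        }
    }
}
```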

Now whether it's efficient or not is another story, but hey, it works. And that, folks, is worth its weight in gold.
