XML Namespaces, XML Schemas, and .NET

While updating Spring.NET today, I found something odd about how .NET handles XML Namespaces.

Let’s say you have the following XML:

<?xml version="1.0" encoding="utf-8" ?>
<foo>
  <bar id="id1">
    <fubar>"id1 fubar1"</fubar>
  </bar>
  <bar id="id2">
    <fubar>"id1 fubar1"</fubar>
    <fubar>"id2 fubar2"</fubar>
  </bar>
</foo>

Now, let’s say that I wanted to process all of the <bar> elements. You can read this pretty easily from .NET using the following:

XmlDocument doc = new XmlDocument();
doc.Load("NamespaceSpike.xml");

XmlNodeList nodes = doc.SelectNodes("//bar");
foreach (XmlNode barNode in nodes)
{
    Console.WriteLine(barNode.OuterXml);
}

Pretty simple, huh? The XPath expression “//bar” selects all of the bar elements within the document. Now, what happens if you add a namespace to this document, like so:

<?xml version="1.0" encoding="utf-8" ?>
<ns:foo xmlns:ns="http://www.tempuri.org">
  <ns:bar id="id1">
    <ns:fubar>"id1 fubar1"</ns:fubar>
  </ns:bar>
  <ns:bar id="id2">
    <ns:fubar>"id1 fubar1"</ns:fubar>
    <ns:fubar>"id2 fubar2"</ns:fubar>
  </ns:bar>
</ns:foo>

If you run the C# code above against the new XML file, nothing is printed. See the “xmlns:ns” attribute I added to the root element? That is a namespace declaration. It binds the “ns” prefix to the “http://www.tempuri.org” namespace, so every element prefixed with “ns:” belongs to that namespace. The unprefixed “//bar” in the XPath expression means a bar element in no namespace, and the document no longer contains one. This is similar to namespaces in .NET, in that it allows XML authors to avoid name conflicts when creating their elements.
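To see why the query comes back empty, it can help to look at what the parser actually stores for each element. The sketch below is only an illustration: the class name PrefixedNamespaceCheck is made up, and the inline XML string is a trimmed stand-in for the prefixed NamespaceSpike.xml so that the example runs without a file on disk.

```csharp
using System;
using System.Xml;

public static class PrefixedNamespaceCheck
{
    // Assumption: a trimmed, inline stand-in for the prefixed NamespaceSpike.xml.
    public const string Xml =
        "<ns:foo xmlns:ns='http://www.tempuri.org'>" +
        "<ns:bar id='id1'/><ns:bar id='id2'/>" +
        "</ns:foo>";

    public static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = Load();
        XmlNode first = doc.DocumentElement.FirstChild;

        // The element's local name is still "bar", but it now lives in a namespace.
        Console.WriteLine(first.LocalName);      // bar
        Console.WriteLine(first.NamespaceURI);   // http://www.tempuri.org

        // An unprefixed XPath name only matches elements in no namespace.
        Console.WriteLine(doc.SelectNodes("//bar").Count);   // 0
    }
}
```

The element is identified by the pair of namespace URI and local name, which is why a bare “bar” in the query no longer finds it.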

Now, that’s all well and good, but how do we work with this? Easy: register the prefix with an XmlNamespaceManager and use it in the XPath expression. Change the C# code to this:

XmlDocument doc = new XmlDocument();
doc.Load("NamespaceSpike.xml");

// Map the "ns" prefix used in the XPath expression to the namespace URI.
XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("ns", "http://www.tempuri.org");

XmlNodeList nodes = doc.SelectNodes("//ns:bar", mgr);
foreach (XmlNode barNode in nodes)
{
    Console.WriteLine(barNode.OuterXml);
}

Notice that we now create an instance of the XmlNamespaceManager class and pass it in when we call SelectNodes. This class maps the prefixes used in the XPath expression to namespace URIs. Also notice how I updated the XPath expression to include the prefix we are searching for.

Easy, huh? Now here comes the hard part, which took me about an hour and a half to figure out.

What do we do about default namespaces? For example, the following XML file is logically the same as the XML file above with the namespace prefixes, but this time the namespace is declared as the document’s default. That means a prefix doesn’t need to be supplied, as it is ‘assumed’:

<?xml version="1.0" encoding="utf-8" ?>
<foo xmlns="http://www.tempuri.org">
  <bar id="id1">
    <fubar>"id1 fubar1"</fubar>
  </bar>
  <bar id="id2">
    <fubar>"id1 fubar1"</fubar>
    <fubar>"id2 fubar2"</fubar>
  </bar>
</foo>

Run the same C# code (the version that uses the XmlNamespaceManager) against this XML file, and you will see that it works fine. How did that work? The code still references the “ns:” prefix from the previous XML file, and this file doesn’t use that prefix anywhere. So what’s going on?

Well, it turns out that XPath doesn’t apply default namespaces: an unprefixed name in an XPath expression always means “no namespace”. So, for documents with a default namespace, you still need to call AddNamespace() on the XmlNamespaceManager, but instead of passing in an empty string, as you would think, you pass in any prefix you like along with the namespace URI. Then use that same prefix in your XPath expression. The prefix only has to agree between AddNamespace() and the expression; it doesn’t have to appear in the XML file, so the XML file doesn’t need to change.
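Here is a small sketch of that idea, run against a document with a default namespace. The class name DefaultNamespaceQuery and the CountBars helper are hypothetical, and the inline XML is a trimmed stand-in for the default-namespace version of NamespaceSpike.xml.

```csharp
using System;
using System.Xml;

public static class DefaultNamespaceQuery
{
    // Assumption: a trimmed, inline stand-in for the default-namespace NamespaceSpike.xml.
    public const string Xml =
        "<foo xmlns='http://www.tempuri.org'>" +
        "<bar id='id1'/><bar id='id2'/>" +
        "</foo>";

    // Counts the bar elements, using whatever prefix the caller picks.
    public static int CountBars(string prefix)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        // The prefix exists only on the query side; the XML never mentions it.
        XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
        mgr.AddNamespace(prefix, "http://www.tempuri.org");

        return doc.SelectNodes("//" + prefix + ":bar", mgr).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountBars("ns"));         // 2
        Console.WriteLine(CountBars("whatever"));   // 2
    }
}
```

Both calls find the same two elements, because the lookup is driven by the namespace URI the prefix is mapped to.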

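It’s strange, because I would have thought the default namespace would have a prefix of “”. But it turns out that the prefix can be anything, as long as it’s mapped to the right namespace URI and used consistently in the expression. Elements are matched by namespace URI, not by prefix.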
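For completeness, this is roughly what the empty-prefix attempt looks like. Treat it as a sketch: the class name EmptyPrefixCheck and its helper are made up, and the XML is again an inline stand-in for the real file. AddNamespace() does accept an empty string, but as far as I can tell the XPath engine never consults that mapping for unprefixed names, so the query still comes back empty.

```csharp
using System;
using System.Xml;

public static class EmptyPrefixCheck
{
    // Assumption: a trimmed, inline stand-in for the default-namespace NamespaceSpike.xml.
    public const string Xml =
        "<foo xmlns='http://www.tempuri.org'>" +
        "<bar id='id1'/><bar id='id2'/>" +
        "</foo>";

    // Registers the namespace under the empty prefix and runs an unprefixed query.
    public static int CountWithEmptyPrefix()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
        mgr.AddNamespace(string.Empty, "http://www.tempuri.org");

        // "bar" with no prefix still means "bar in no namespace" to XPath.
        return doc.SelectNodes("//bar", mgr).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithEmptyPrefix());   // 0
    }
}
```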