2007
06.10

What with the big push to have XML adopted as a cross-everything data format, it seems like it should be fairly straightforward to parse an XML file in .NET.  For the most part, I would say that it is, but I ran across a situation recently that was a bit difficult to solve — mostly because of the lack of available information on the subject.
The problem is parsing an XML file in which some of the node names carry a namespace prefix.  Unusual, you say?

That’s what I thought until I came across the RSS iTunes feed from Apple.
It’s not totally clear to me why they want to embed namespaces in some of the nodes, but it is the way they do it, nonetheless.
RSS, as you probably know, is just a standard for formatting data delivered by content providers (e.g. Web services).  It has some basic, standard node names, but is also extensible so that a consumer of the data can extract the extra information if it is aware of it.
I won’t go into the details of RSS here, but basically, each item in the feed is sandwiched between <item> node tags (this is RSS 2.0), with associated <title>, <description> and other nodes.
In the case of the Apple feed, a number of the proprietary nodes are formatted with namespaces like so: <itms:artist>, <itms:artistLink>, etc.  I had never seen a node with a namespace before (don’t get around much, I guess), but it didn’t seem like it should be too hard to extract the data.  Boy, was I wrong!
Just finding out how to do this took hours of searching around on the web. I finally found a small example on a non-English web site or I would probably still be looking.  None of the Microsoft examples of XML parsing I came across had any nodes with namespaces on them. PHP doesn’t have any trouble with the namespace-qualified nodes, but C# is a very different story :)

I am going to present some small code samples to fix this problem that might save you a lot of trouble trying to figure out how to do the same thing. (I realize XSLT would be easier, but you might need to do it this way for some reason.)

So the Apple RSS feed (simplified) looks something like this:

<rss version="2.0"  other stuff…>
<channel>
<item>

<title>
Title here
</title>

<itms:artist>
Artist name
</itms:artist>

<itms:album>
Album name
</itms:album>

<itms:releasedate>
Release Date here
</itms:releasedate>

<itms:coverArt height="100" width="100">
theImage.jpg
</itms:coverArt>

</item>

…More items here

</channel>
</rss>

The nodes with the <itms: > namespace are the proprietary ones.
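
If it helps to see what that itms: prefix really is, here is a minimal sketch. The prefix is just shorthand for a namespace URI that the feed declares with an xmlns:itms attribute (part of the "other stuff…" in the rss tag above). The URI below is a made-up placeholder, not the one from the feed, and the class and method names are mine.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class PrefixSketch
{
    // Placeholder URI for illustration only; the real feed declares its own.
    public const string ItmsUri = "http://example.com/itms";

    // Describes each child element of the first <item>
    // as "prefix|localName|namespaceURI".
    public static List<string> DescribeItemChildren(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> result = new List<string>();
        XmlNode item = doc.SelectSingleNode("/rss/channel/item");
        if (item == null)
            return result;

        foreach (XmlNode child in item.ChildNodes)
        {
            if (child.NodeType != XmlNodeType.Element)
                continue;
            result.Add(child.Prefix + "|" + child.LocalName + "|" + child.NamespaceURI);
        }
        return result;
    }

    public static void Main()
    {
        string xml =
            "<rss version=\"2.0\" xmlns:itms=\"" + ItmsUri + "\"><channel><item>" +
            "<title>Title here</title>" +
            "<itms:artist>Artist name</itms:artist>" +
            "</item></channel></rss>";

        foreach (string line in DescribeItemChildren(xml))
            Console.WriteLine(line);
    }
}
```

The plain <title> node comes back with an empty prefix and an empty namespace URI, while <itms:artist> reports the prefix "itms" and the declared URI. That URI, not the prefix text, is what an XPath query ultimately has to match.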

I’m just trying to get the “theImage.jpg” string from the <itms:coverArt> node.  Simple?  Let’s see…

Loading the document is easy:

// Needs: using System.Xml; using System.Xml.XPath;
XmlDocument doc = new XmlDocument();
doc.Load("MyXML.xml");

Getting the item nodes is easy:
//RSS 2.0
XmlNodeList items = doc.SelectNodes("/rss/channel/item"); 

Now I have a collection of the “item” nodes.

Looping through them is easy:

foreach (XmlNode itm in items)
{
    //Here's the tricky part.
    //To get the data in the nodes with namespaces,
    //I have to jump through some hoops!

    XPathNavigator nav = itm.CreateNavigator();

    //Here I am setting up IXmlNamespaceResolver
    //to pick nodes that have a namespace of "itms"
    //Don't ask me to explain the syntax.

    IXmlNamespaceResolver nsResolverImg =
        nav.SelectSingleNode("//*[namespace::itms]");

    //No node in the document knows about "itms", so nothing to look up
    if (nsResolverImg == null)
        continue;

    //Picking a node that is a "coverArt" node, in the itms namespace with an
    // attribute of height=100

    XPathNavigator img =
        nav.SelectSingleNode("itms:coverArt[@height='100']", nsResolverImg);

    //Skip items without matching cover art rather than
    //hiding a null reference inside an empty catch block
    if (img == null)
        continue;

    //Value.Trim() drops the line breaks around the text
    string strImage = img.Value.Trim();  //retrieves "theImage.jpg"
}

Simple?  Yes, once you know the trick.  Intuitive it is definitely NOT.
What’s with all this //*[namespace::itms] stuff anyway?  Why can’t I just call this:
XPathNavigator img = nav.SelectSingleNode("itms:coverArt[@height='100']");
without all the IXmlNamespaceResolver stuff?  Isn’t that just as clear?  Well, apparently the designers at Microsoft didn’t think so.
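
For what it's worth, the one-argument call fails because a prefix in an XPath query means nothing until it is bound to a namespace URI, and the XPath engine does not assume that the prefixes in the query match the ones in the document. A more commonly seen way to supply that binding is an XmlNamespaceManager. Here is a minimal sketch, assuming a placeholder namespace URI (you would use whatever the feed's xmlns:itms attribute declares); the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public class NamespaceManagerSketch
{
    // Placeholder URI for illustration only; use the one the feed declares.
    public const string ItmsUri = "http://example.com/itms";

    // Returns the text of the first itms:coverArt node with height=100,
    // or null when there is no such node.
    public static string FirstCoverArt(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Bind the prefix used in the query to the namespace URI.
        // The prefix here does not have to match the one in the document;
        // only the URI has to match.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("itms", ItmsUri);

        XmlNode img = doc.SelectSingleNode(
            "/rss/channel/item/itms:coverArt[@height='100']", ns);
        return img == null ? null : img.InnerText.Trim();
    }

    public static void Main()
    {
        string xml =
            "<rss version=\"2.0\" xmlns:itms=\"" + ItmsUri + "\"><channel><item>" +
            "<title>Title here</title>" +
            "<itms:coverArt height=\"100\" width=\"100\">theImage.jpg</itms:coverArt>" +
            "</item></channel></rss>";

        Console.WriteLine(FirstCoverArt(xml));  // prints "theImage.jpg"
    }
}
```

The trade-off is that this route needs the namespace URI written into the code, whereas the //*[namespace::itms] trick borrows it from the document itself.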

Hope this has helped you in solving a similar issue. If you run across any variations of this, please post them for the benefit of all.  I would especially like to see some details on how the “//*[” works with other scenarios in the SelectSingleNode method.
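
A partial answer, as best I can work out from the XPath 1.0 rules: "//*" means every element anywhere in the document, and the "[namespace::itms]" part is a filter on the namespace axis that keeps only elements with a namespace named "itms" in scope. SelectSingleNode then hands back the first match, which can act as the resolver. Two variations are sketched below under that reading. The class and method names are hypothetical, and the namespace URI is a placeholder rather than the one the feed uses.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public class XPathVariations
{
    // Variation 1: match on local-name() so no prefix appears in the query
    // and no resolver is needed. This matches coverArt in ANY namespace.
    public static string ByLocalName(XPathNavigator item)
    {
        XPathNavigator img = item.SelectSingleNode(
            "*[local-name()='coverArt'][@height='100']");
        return img == null ? null : img.Value.Trim();
    }

    // Variation 2: let the item's own navigator resolve the prefix.
    // This relies on "itms" being in scope at the item, for example
    // because it is declared on an ancestor such as <rss>.
    public static string ByItemResolver(XPathNavigator item)
    {
        XPathNavigator img = item.SelectSingleNode(
            "itms:coverArt[@height='100']", item);
        return img == null ? null : img.Value.Trim();
    }

    // Helper: navigator positioned on the first <item>, or null.
    public static XPathNavigator FirstItem(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode item = doc.SelectSingleNode("/rss/channel/item");
        return item == null ? null : item.CreateNavigator();
    }

    public static void Main()
    {
        // Placeholder namespace URI for illustration only.
        string xml =
            "<rss version=\"2.0\" xmlns:itms=\"http://example.com/itms\"><channel><item>" +
            "<itms:coverArt height=\"100\" width=\"100\">theImage.jpg</itms:coverArt>" +
            "</item></channel></rss>";

        XPathNavigator item = FirstItem(xml);
        Console.WriteLine(ByLocalName(item));
        Console.WriteLine(ByItemResolver(item));
    }
}
```

The local-name() form is the least fussy, but it would also match a coverArt node from some other namespace, so the prefixed query is the safer choice when a feed mixes several extensions.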

Microsoft guys: How about some better docs – with examples – on this stuff?

midniteblogger.


					
2007
06.09

Having recently purchased a new Gateway desktop computer, I was pleased to learn it had Vista Ultimate loaded on it.   This is Microsoft’s top-of-the-line OS so I was expecting great things.
Having also heard that Vista really sucks up the RAM, I purchased way more than recommended — 4 GB.
After firing up the PC and booting into the OS, however, I was disappointed upon checking Task Manager and learning that Vista was only recognizing 3 GB of my RAM!  Bummer.
BIOS told me there were 4 GB present, so it must be a 32-bit Vista issue with the OS only seeing 3.
Being a programmer, I concluded that 32-bit Vista could only handle 3 GB of RAM.  (Seemed to make sense at the time, but then I started thinking: why should it be a problem for this cool new OS to see all the memory in my machine?)
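
The arithmetic behind that guess, as I understand it, goes roughly like this: a 32-bit address can name 2^32 bytes (4 GB) in total, and part of that range is claimed by devices such as the video card rather than by RAM. The 1 GB device reservation below is an assumed figure for illustration only; the real amount depends on the hardware. The class and method names are hypothetical.

```csharp
using System;

public class AddressSpaceSketch
{
    // Bytes of RAM visible when the installed memory has to share a
    // 32-bit address range with memory reserved for devices.
    public static long UsableBytes(long installedBytes, long reservedForDevices)
    {
        long addressable = 1L << 32;  // 2^32 bytes = 4 GB of addresses
        long leftForRam = addressable - reservedForDevices;
        return Math.Min(installedBytes, leftForRam);
    }

    public static void Main()
    {
        long gb = 1L << 30;

        // Assumption: 1 GB of the address range is reserved for devices.
        long usable = UsableBytes(4 * gb, 1 * gb);
        Console.WriteLine(usable / gb);  // prints 3
    }
}
```

With 2 GB installed the same function returns the full 2 GB, which would explain why the shortfall only shows up on machines loaded with close to 4 GB.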
Because I was at Tech Ed in Orlando this week, I decided to ask the friendly folks at the Microsoft Vista booth about my dilemma in losing the use of 1 GB of RAM.
The guy in the booth didn’t really understand why it wasn’t working and thought Vista might need some kind of /3GB entry in the boot.ini file.
He looked around for a while on Live Search and came up with a couple of articles on MSDN about modifying the boot.ini that looked a bit dated and not totally relevant.
In the end, he wasn’t much help at all.
I’m having a hard time believing I have to spring for a 64-bit OS just to get the use of all 4 GB of RAM.  (Couldn’t even NT 4 use more than that?)
Would love to hear from anyone with ideas on this.
Midniteblogger