Reading and Writing XML in .NET Version 2.0 - Part 3

"Loading and Persisting XML with an XML Document Store Object"

©2005 Alex Homer, Stonebroom Limited (alex@stonebroom.com)


 

This is the third in a series of three articles that look in detail at how the new features within the System.Xml namespace in version 2.0 of the .NET Framework can be used to read and write XML documents, and interact with the new XML document store objects.

 

 

In this final article, we look at how the updated XML document store objects XmlDocument, XmlDataDocument and XPathDocument can be used - both stand-alone and in conjunction with the XmlWriter and XmlWriterSettings classes - to read, persist and write XML documents and fragments more easily and more efficiently than in version 1.x of .NET.

 

 

As in the previous article, we'll look into the issues involved in using the new classes, the reasoning behind the changes, and how the new features simplify your code and provide better overall efficiency for your applications.

The Version 2.0 XML Document Stores

In the previous two articles, you've seen how the new XmlReader and XmlWriter classes are both powerful and easy to use. They provide the best performance when reading, streaming and writing XML documents where you don't need to persist the XML content as a tree of nodes in memory. However, there are times when you need to load and access a complete XML document - for example, when you want random access to the content, use XPath queries to locate fragments or individual elements (perhaps to perform XSLT transformations on subsets of elements), or apply business rules that require access to multiple sections of the document.

 

As in version 1.x, there are three XML document objects in System.Xml and its subsidiary namespaces. However, the updated XmlDocument class gains many new features compared to the version 1.x equivalent. The three objects are XmlDocument, XmlDataDocument and XPathDocument.

 

 

 

 

The XmlDocument class is the primary choice for tasks where an in-memory store is required, unless you can achieve all the requirements using the XPathDocument class with XPath queries. The XmlDataDocument should only be used where you need to access the contents as a DataSet.

Methods for Loading and Persisting Data with the XML Document Stores

This section provides a brief summary of the methods and properties that you can use to load and save XML data from document stores. It will help you to choose the correct one, and help you see how the techniques described in the remainder of this article relate to the three objects. Table 1 lists the relevant methods and properties, showing which are available on each of the three document stores, and the editing capability of the XPathNavigator that is returned from the CreateNavigator method.

 

Table 1 - The Methods and Properties for Reading and Writing XML with an XML Document Store

| Method/Property | XmlDocument | XmlDataDocument | XPathDocument | Description |
| --- | --- | --- | --- | --- |
| Load | Yes | Yes | No | Loads XML from a Stream, a URI or disk file, an XmlReader or a TextReader. |
| LoadXml | Yes | Yes | No | Loads the XML contained in a String. |
| ReadNode | Yes | Yes | No | Creates an XmlNode instance from the contents of an XmlReader, which can then be inserted into the document. |
| Save | Yes | Yes | No | Saves the XML content to a Stream, a disk file, an XmlWriter or a TextWriter. |
| WriteTo | Yes | Yes | No | Writes the complete current node and all its content to an XmlWriter. |
| WriteContentTo | Yes | Yes | No | Writes just the content of the current node and its child nodes to an XmlWriter. |
| InnerXml | Yes | Yes | No | Gets or sets the XML content of the current node, including the markup it contains. |
| InnerText | Yes | Yes | No | Gets or sets just the text content of the current node and its child nodes. |
| OuterXml | Yes | Yes | No | Gets the XML content and containing tags of the current node, including the markup it contains (read-only on these classes). |
| CreateNavigator | Read/Write | Read/Write | Read Only | Creates an XPathNavigator at the current location within the document. |

 

Table 2 shows the methods and properties of the XPathNavigator class that relate to reading and writing XML content from an XML document store to which it is attached. Notice that the CanEdit property can be queried to see if the underlying document store supports editing of the XML content.

 

Table 2 - The Methods and Properties for Reading and Writing XML with an XPathNavigator

| Method/Property | Description |
| --- | --- |
| ReadSubtree | Returns an XmlReader positioned on the current node, allowing this node and its content to be read from the XML document one node at a time, or streamed into another object such as XslCompiledTransform. |
| WriteSubtree | Writes the current node and its content to an existing XmlWriter, which can target a disk file, a Stream, or another writer object instance. |
| InnerXml | Gets or sets the XML content of the current node, including the markup it contains. Cannot be used to set the content in an XPathDocument. |
| OuterXml | Gets or sets the XML content and containing tags of the current node, including the markup it contains. Cannot be used to set the content in an XPathDocument. |
| CanEdit | Indicates if this XPathNavigator supports editing of the XML document. |

 

Reading and Writing XML with an XPathDocument

As you can see from the tables above, the XPathDocument provides no read/write support directly. You create an XPathDocument by specifying the source XML in the constructor (there is no Load method), but you can only access the content in read-only fashion via an XPathNavigator. We've created an example named xpathdocument.aspx that demonstrates the features for reading and writing XML with an XPathDocument instance, which we'll describe next. You can run or download all of the samples from our Website at http://www.daveandal.net/articles/readwritexml/. All the examples contain a [view source] link that you can use to view the source code.

Loading an XML Document

The first step is to load an XML document. Unlike the other two XML document stores, the XPathDocument does not have a Load method - so you must specify a Stream, an XmlReader, a TextReader or the URI or path of a file or resource that contains the XML in the constructor. You can optionally specify the white-space handling when using an XmlReader, or a URI/file path. The available values are XmlSpace.Default, XmlSpace.None and XmlSpace.Preserve. The code below creates a StringBuilder to hold the results for the page, and the paths to the input and output documents we'll be using:

 

Dim builder As New StringBuilder()

Dim xp As XPathDocument = Nothing

Dim sInPath As String = Server.MapPath("data/slides.xml")

Dim sOutPath As String = Server.MapPath("output/writesubtree.xml")

 

Try

  ' create an XPathDocument containing the XML document

  xp = New XPathDocument(sInPath)

  builder.Append("<p><b>Loaded XPathDocument</b> with " & sInPath & "</p>")

Catch ex As Exception

  builder.Append("<p><b>ERROR creating XPathDocument:</b><br />")

  builder.Append("Message = " & ex.Message & "</p>")

  Return

End Try
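
The constructor overload that takes an XmlReader and an XmlSpace value can be sketched as follows. This is a minimal console example rather than the code from the sample page; the LoadFromString helper and the inline XML are assumptions made purely for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.XPath

Module XPathDocumentFromReader

  ' Hypothetical helper: builds an XPathDocument from an XML string
  Function LoadFromString(ByVal xml As String) As XPathDocument
    Using sr As New StringReader(xml)
      Using xr As XmlReader = XmlReader.Create(sr)
        ' XmlSpace.Preserve keeps white-space nodes from the source
        Return New XPathDocument(xr, XmlSpace.Preserve)
      End Using
    End Using
  End Function

  Sub Main()
    ' illustrative XML only, not the slides.xml file used by the article
    Dim xp As XPathDocument = LoadFromString( _
        "<root><slide position=""1""><title>Agenda</title></slide></root>")
    Dim title As XPathNavigator = _
        xp.CreateNavigator().SelectSingleNode("/root/slide/title")
    Console.WriteLine(title.Value)
  End Sub

End Module
```

The XPathDocument reads the whole source inside its constructor, so the reader can safely be disposed as soon as the constructor returns.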

Creating an XPathNavigator and Displaying its Properties

To access the content, we must create an XPathNavigator over the document. In the next section of code, we do this and move it to the first slide element in the document. Then we can display the values of the properties you saw listed in Table 2:

 

' create XPathNavigator over document and move to first <slide> element

Dim xn As XPathNavigator = xp.CreateNavigator()

xn.MoveToFirstChild()   ' move to root element

xn.MoveToFirstChild()   ' move to session element

xn.MoveToFirstChild()   ' move to slides element

xn.MoveToFirstChild()   ' move to slide element

builder.Append("<p><b>Created XPathNavigator</b> and moved to the first <b>" _

    & xn.Name & "</b> element</p>")

 

' display property values from XPathNavigator  

builder.Append("<p>XPathNavigator.<b>CanEdit</b> property = " _

    & xn.CanEdit.ToString() & "</p>")

builder.Append("<p>XPathNavigator.<b>OuterXml</b> property = " _

    & Server.HtmlEncode(xn.OuterXml) & "</p>")

builder.Append("<p>XPathNavigator.<b>InnerXml</b> property = " _

    & Server.HtmlEncode(xn.InnerXml) & "</p>")

 

Figure 1 shows the results of running this page, and you can see the values of the properties extracted in the previous section of code. Notice that the CanEdit property returns False, because the XPathDocument is read-only and so the XPathNavigator reflects this. You can also see the difference between the InnerXml and OuterXml properties in this screenshot - the InnerXml property does not include the start and end tags of the slide element.

 

Figure 1 - Reading and writing XML with an XPathDocument instance
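
To see how CanEdit reflects the underlying store, the following sketch creates a navigator over the same XML held in an XPathDocument and in an XmlDocument. The module name, helper functions and sample XML are assumptions for illustration, not part of the sample pages.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.XPath

Module CanEditComparison

  ' illustrative XML only
  Const SampleXml As String = _
      "<slides><slide position=""1""><title>Agenda</title></slide></slides>"

  Function NavigatorOverXPathDocument() As XPathNavigator
    Return New XPathDocument(New StringReader(SampleXml)).CreateNavigator()
  End Function

  Function NavigatorOverXmlDocument() As XPathNavigator
    Dim doc As New XmlDocument()
    doc.LoadXml(SampleXml)
    Return doc.CreateNavigator()
  End Function

  Sub Main()
    Dim readOnlyNav As XPathNavigator = NavigatorOverXPathDocument()
    Dim editableNav As XPathNavigator = NavigatorOverXmlDocument()
    Console.WriteLine("XPathDocument CanEdit = " & readOnlyNav.CanEdit.ToString())
    Console.WriteLine("XmlDocument CanEdit = " & editableNav.CanEdit.ToString())

    ' only the navigator over the XmlDocument accepts edits
    editableNav.MoveToFirstChild()   ' move to slides element
    editableNav.MoveToFirstChild()   ' move to slide element
    editableNav.MoveToFirstChild()   ' move to title element
    editableNav.SetValue("Welcome")
    Console.WriteLine(editableNav.OuterXml)
  End Sub

End Module
```

Checking CanEdit before calling an editing method such as SetValue avoids an exception when the code is handed a navigator over a read-only store.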

 

Using the ReadSubtree Method of the XPathNavigator

Next the code calls the ReadSubtree method of the XPathNavigator to get back an XmlReader positioned at the current node in the document (the first slide node). Using code similar to that we described in the first article in this series, we can iterate through the nodes exposed by the XmlReader and display the contents. If you look back at Figure 1 you'll see the results:

 

' create XmlReader and read current node using ReadSubtree method

Try

  Dim xr As XmlReader = xn.ReadSubtree()

  builder.Append("<p>Values from the <b>ReadSubtree</b> method:<br />")

  While xr.Read()

    If xr.IsStartElement() Then

      ' write out element name

      builder.Append("Element Name: " & xr.Name & "<br />")

      ' see if this element has any attributes

      If xr.HasAttributes Then

        While xr.MoveToNextAttribute()

          ' iterate through the attributes displaying the

          ' name and the value of each one

          builder.Append(" - Attribute Name: " & xr.Name)

          builder.Append(" &nbsp; Value: '" & xr.Value & "'<br />")

        End While

      End If

    End If

    ' if this is a text node then just display the value.

    If xr.NodeType = XmlNodeType.Text Then

      builder.Append("Element Value: '" & xr.Value & "'" & "<br />")

    End If

  End While

  builder.Append("</p>")

Catch ex As Exception

  builder.Append("<p><b>ERROR executing the ReadSubtree method:</b><br />")

  builder.Append("Message = " & ex.Message & "</p>")

End Try

Using the WriteSubtree Method of the XPathNavigator

While the ReadSubtree method returns an XmlReader, the WriteSubtree method instead accepts an existing XmlWriter instance and writes the content of the current node to that writer. The following code shows the XmlWriter being created with an XmlWriterSettings instance and the static Create method (as described in the previous article). The writer is then passed to the WriteSubtree method. Afterwards, the disk file is read using the File.ReadAllText method and added to the StringBuilder, the contents of which are then dumped into a Label control on the page to display the results:

 

' create XmlWriter and write contents of current element to disk

Dim ws As New XmlWriterSettings()

ws.Indent = True

Dim xw As XmlWriter = Nothing

Try

  xw = XmlWriter.Create(sOutPath, ws)

  xn.WriteSubtree(xw)

Catch ex As Exception

  builder.Append("<p><b>ERROR executing the WriteSubtree method:</b><br />")

  builder.Append("Message = " & ex.Message & "</p>")

Finally

  ' xw is still Nothing if XmlWriter.Create failed
  If xw IsNot Nothing Then xw.Close()

End Try

 

' read new disk file and display content in the page

builder.Append("<p>Contents of file created with the <b>WriteSubtree</b> method:")

builder.Append("<pre>" & Server.HtmlEncode(File.ReadAllText(sOutPath)) & "</pre></p>")

 

' display the results

Label1.Text = builder.ToString()

 

Looking back at Figure 1, you can see that the WriteSubtree method creates the following XML document to represent the contents of the first slide node:

 

<?xml version="1.0" encoding="utf-8"?>

<slide position="1" xmlns="http://myns/slidesdemo">

  <title>Agenda</title>

  <rv:reviewed xmlns:rv="http://myns/slidesdemo/reviewdate">2004-05-10T00:00:00</rv:reviewed>

</slide>

 

Reading and Writing XML with an XmlDocument

Having seen how the XPathDocument provides read-only access to XML documents, and supports techniques for persisting the XML in various ways, we'll now look at the XmlDocument class. This is a far more complex class, providing full support for the XML DOM Level 2 methods. However, in line with the topics of these three articles, we'll be concentrating here on the capabilities for reading and writing XML.

 

Back in Table 1, we showed the properties and methods that are available for the XmlDocument class. The second example, xmldocument.aspx, demonstrates these in a single page that contains three buttons: Show Properties, Read & Save and Create Navigator.

 

 

There are also common routines within the page that demonstrate loading and validating an XML document, which apply to both the XmlDocument and XmlDataDocument classes. We'll look at these first.

Loading and Validating an XML Document

The XmlDocument and XmlDataDocument classes each have a default constructor that takes no parameters, so you have to load the XML source document using one of the other methods. (The other XmlDocument constructor overloads accept only an XmlNameTable or an XmlImplementation, and an XmlDataDocument can also be created over an existing DataSet.) Usually, you will load your XML documents using the Load or LoadXml method, both of which are described in Table 1 earlier in this article.

 

In this example, we created a custom function named GetXmlDocument that uses the Load method, and specifies the path to one of two XML disk files. If the user has set a checkbox named chkLoadInvalid, then we load an XML document that is invalid against its schema. Otherwise we load a valid document:

 

Function GetXmlDocument() As XmlDocument

  ' returns a populated and (if specified) validated XmlDocument instance

  Try

    ' create an XmlDocument and load the XML document

    Dim doc As New XmlDocument()

    Dim sPath As String = sValidInPath

    If chkLoadInvalid.Checked Then

      sPath = sInvalidInPath

    End If

    doc.Load(sPath)

    builder.Append("<p><b>Loaded XmlDocument</b> with " & sPath & "</p>")

   ...

 

The function then determines if the user has ticked the chkValidate checkbox. If they have, we must create a populated XmlSchemaSet containing the schemas required to validate this document (these are the same as we used in the first article), and assign this to the Schemas property of the XmlDocument. We also call the Validate method of the document at this point, specifying a reference to an event handler that will be executed when any validation errors occur. We'll look at this event handler shortly. In the meantime, after validation, we return the populated XmlDocument instance from the function, or Nothing (null in C#) if there is an error loading the XML from the disk file:

 

    ...

    ' see if validation is required for the document

    If chkValidate.Checked Then

      builder.Append("<p>Checking the validity of the document...<br />")

      ' get XmlSchemaSet instance

      Dim xs As XmlSchemaSet = GetXmlSchemaSet()

      doc.Schemas = xs

      doc.Validate(AddressOf MyValidationHandler)

    End If

    Return doc

  Catch ex As Exception

    ' ... display error details here ...

    Return Nothing

  End Try

End Function

 

The GetXmlSchemaSet function that creates the XmlSchemaSet we used in the previous section of code is shown next. It performs the same tasks as we did when setting up validation for an XmlReader through an XmlReaderSettings instance in the first article:

 

Function GetXmlSchemaSet() As XmlSchemaSet

  ' creates and populates an XmlSchemaSet for the slides.xml document

  Dim ss As New XmlSchemaSet()

  ss.Add("http://myns/slidesdemo", Server.MapPath("data/schema/slides.xsd"))

  ss.Add("http://myns/slidesdemo/reviewdate", Server.MapPath("data/schema/slidesrev.xsd"))

  Return ss

End Function

 

The event handler we referenced when we called the Validate method of the XmlDocument is shown next. Again, this is similar to the handler we used in the examples that read and validated an XML document using an XmlReader in the first article. Both accept a ValidationEventArgs instance that contains details of the error:

 

Sub MyValidationHandler(ByVal sender As Object, ByVal args As ValidationEventArgs)

  If args.Severity = XmlSeverityType.Warning Then

    builder.Append("<b>Warning:</b> ")

  ElseIf args.Severity = XmlSeverityType.Error Then

    builder.Append("<b>Error:</b> ")

  End If

  builder.Append(args.Message & "<br />")

End Sub

 

If you run the example and set the two checkboxes, then click the Show Properties button (or, in fact, any of the three buttons, as they all use the same GetXmlDocument method), you'll see the output from the event handler in the page (see Figure 2).

 

Figure 2 - Loading an XML document and validating it with the Validate method of the XmlDocument class

 

 

While validation is simple in an XmlDocument or an XmlDataDocument using the Validate method, this isn't available in an XPathDocument. However, you can enable validation on an XmlReader and use this as the source of the XML when creating an XPathDocument. Alternatively, as you'll see later, you can perform validation using an XPathNavigator over any XML document store.
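
A validating XmlReader used as the source for an XPathDocument might look like the sketch below. The inline schema, the module name and the LoadAndCountErrors helper are assumptions for illustration; the sample pages use the slides.xsd schemas instead.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema
Imports System.Xml.XPath

Module ValidatedXPathDocument

  ' Hypothetical schema: a slide element containing a single title
  Private Const Xsd As String = _
      "<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">" & _
      "<xs:element name=""slide""><xs:complexType><xs:sequence>" & _
      "<xs:element name=""title"" type=""xs:string""/>" & _
      "</xs:sequence></xs:complexType></xs:element></xs:schema>"

  Private errorCount As Integer

  Private Sub OnValidation(ByVal sender As Object, ByVal args As ValidationEventArgs)
    If args.Severity = XmlSeverityType.Error Then errorCount += 1
  End Sub

  ' Loads the XML into an XPathDocument through a validating reader
  ' and returns the number of validation errors reported on the way
  Function LoadAndCountErrors(ByVal xml As String) As Integer
    errorCount = 0
    Dim rs As New XmlReaderSettings()
    rs.ValidationType = ValidationType.Schema
    rs.Schemas.Add(Nothing, XmlReader.Create(New StringReader(Xsd)))
    AddHandler rs.ValidationEventHandler, AddressOf OnValidation
    Using xr As XmlReader = XmlReader.Create(New StringReader(xml), rs)
      ' validation happens while the constructor reads the document
      Dim xp As New XPathDocument(xr)
    End Using
    Return errorCount
  End Function

  Sub Main()
    Console.WriteLine(LoadAndCountErrors("<slide><title>Agenda</title></slide>"))
    Console.WriteLine(LoadAndCountErrors("<slide><heading>Agenda</heading></slide>"))
  End Sub

End Module
```

Because a handler is attached, the document still loads when it is invalid; the handler simply records the problems as the reader encounters them.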

Reading the InnerXml, OuterXml and InnerText Properties

When you click the Show Properties button in the example page, the code simply extracts and displays the values of the OuterXml, InnerXml and InnerText properties of the XmlDocument, as shown in the following listing:

 

' get a populated XmlDocument instance

Dim xd As XmlDocument = GetXmlDocument()

 

If Not xd Is Nothing Then

  ' display property values from XmlDocument  

  builder.Append("<p><b>DocumentElement.OuterXml</b> property = " _

      & Server.HtmlEncode(xd.DocumentElement.OuterXml) & "</p>")

  builder.Append("<p><b>DocumentElement.InnerXml</b> property = " _

      & Server.HtmlEncode(xd.DocumentElement.InnerXml) & "</p>")

  builder.Append("<p><b>DocumentElement.InnerText</b> property = " _

      & Server.HtmlEncode(xd.DocumentElement.InnerText) & "</p>")

End If

 

' display the results

Label1.Text = builder.ToString()

 

Figure 3 shows the results (validation is not turned on in this case). You can clearly see that the OuterXml property returns a String that contains the root element tags, while the InnerXml property returns a String that does not. The InnerText property returns a concatenated String containing the values of all the text nodes (the element values). This does not include the values of the attributes on the elements in the document.

 

Figure 3 - The results of querying the OuterXml, InnerXml and InnerText properties
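
The difference between the three properties is easy to reproduce on a tiny document loaded with LoadXml. The following is a minimal sketch; the module name, the LoadSample helper and the XML string are assumptions for illustration.

```vb
Imports System
Imports System.Xml

Module InnerOuterDemo

  ' Hypothetical helper that loads a small illustrative document
  Function LoadSample() As XmlDocument
    Dim doc As New XmlDocument()
    doc.LoadXml("<slide position=""1""><title>Agenda</title><notes>Intro</notes></slide>")
    Return doc
  End Function

  Sub Main()
    Dim root As XmlElement = LoadSample().DocumentElement
    ' prints <slide position="1"><title>Agenda</title><notes>Intro</notes></slide>
    Console.WriteLine(root.OuterXml)
    ' prints <title>Agenda</title><notes>Intro</notes>
    Console.WriteLine(root.InnerXml)
    ' prints AgendaIntro (the attribute value is not included)
    Console.WriteLine(root.InnerText)
  End Sub

End Module
```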

Using the ReadNode and Save Methods

The second button in the example page, Read & Save, demonstrates how you can use four of the methods of the XmlDocument class that are concerned with reading and writing XML. First, after loading the same XML document (either a valid or an invalid one, depending on the settings in the second checkbox), it uses the ReadNode method to read XML from a disk file through an XmlReader. This creates a new XmlNode instance, which can then be inserted into the document at the appropriate point.

 

The XML we're reading in is a slide node containing title and reviewed elements, just as you see in the original XML document. Notice that we have to specify the namespace for the reviewed element prefix rv, or the reader will be unable to parse it correctly. However, we haven’t specified the default namespace:

 

<slide position="4" xmlns:rv="http://myns/slidesdemo/reviewdate">

  <title>An Extra Slide</title>

  <rv:reviewed >2005-06-02T00:00:00</rv:reviewed>

</slide>

 

The code that reads this node and inserts it into the document is shown next. We use the static Create method and an XmlReaderSettings instance to create the XmlReader, then move to the actual contents of the XML file using the MoveToContent method. Then we call the ReadNode method of the XmlDocument, passing in the XmlReader, and get back a reference to the new XmlNode. We then get a reference to the slides element within the XML we originally loaded into the XmlDocument - using the XML DOM properties DocumentElement and FirstChild to walk down the node tree to it. At this point we can use the XML DOM AppendChild method to insert the new slide node into the document:

 

' get a populated XmlDocument instance

Dim xd As XmlDocument = GetXmlDocument()

If Not xd Is Nothing Then

 

  ' add a new node to the document using the ReadNode method

  builder.Append("<p>Adding a new node using the <b>ReadNode</b> method...<br />")

  ' create an XmlReader over the disk file containing the new XML node

  Dim rs As New XmlReaderSettings()

  rs.CloseInput = True

  Dim xr As XmlReader = Nothing

  Try

    xr = XmlReader.Create(sNewNodePath, rs)

    ' move the reader to the XML content and read the node

    xr.MoveToContent()

    Dim en As XmlNode = xd.ReadNode(xr)

    ' now insert the new node into the XML document

    Dim slides As XmlNode = xd.DocumentElement.FirstChild.FirstChild

    slides.AppendChild(en)

  Catch ex As Exception

    ' ... display error details here ...

  Finally

    ' xr is still Nothing if XmlReader.Create failed
    If xr IsNot Nothing Then xr.Close()

  End Try

  ...

 

Then we can save the updated document to a disk file by calling the Save method. This method can accept a Stream, an XmlWriter, a TextWriter or the path of a disk file; here we're using a disk file. After writing the document to disk, we read it back and display it:

 

  ...

  ' save the entire XmlDocument contents to a disk file

  Try

    xd.Save(sOutPath)

    builder.Append("<p><b>Saved XmlDocument</b> to disk file, content is:")

    builder.Append("<pre>" & Server.HtmlEncode(File.ReadAllText(sOutPath)) & "</pre>")

  Catch ex As Exception

    ' ... display error details here ...

  End Try

 

Figure 4 shows the results of the code you've seen so far. You can see the new XML document with the added slide node at the end.

 

Figure 4 - Using the ReadNode and Save methods of the XmlDocument class to update an XML document
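
When the saved XML is only needed in memory, the TextWriter overload of Save avoids the round trip through a disk file. The following is a minimal sketch; the SaveToString helper, the module name and the sample XML are assumptions for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module SaveToTextWriter

  ' Hypothetical helper: serializes the whole document into a String
  Function SaveToString(ByVal doc As XmlDocument) As String
    Using sw As New StringWriter()
      doc.Save(sw)
      Return sw.ToString()
    End Using
  End Function

  Sub Main()
    Dim doc As New XmlDocument()
    ' illustrative XML only
    doc.LoadXml("<slide><title>Agenda</title></slide>")
    Console.WriteLine(SaveToString(doc))
  End Sub

End Module
```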

Using the WriteTo and WriteContentTo Methods

The code you've just seen that runs when you click the Read & Save button continues by demonstrating the WriteTo and WriteContentTo methods. It selects a node within the document (the first reviewed element) using an XPath expression with the SelectSingleNode method of the XmlDocument. However, for this to be possible, we have to provide an XmlNamespaceManager containing both the default and the "rv" namespace and prefix declarations used in the document for the second parameter of the SelectSingleNode method. The code we use here is very similar to that in the first article, where we had to do the same thing when using an XmlReader.

 

Both WriteTo and WriteContentTo methods take an XmlWriter as the target, and so - after selecting the reviewed node - the code generates an XmlWriterSettings instance to use when creating these writers. After creating the first XmlWriter we call the WriteTo method of the selected node to write the node and its content to a disk file, then read it back and display it:

 

  ' select a single node in document, an XmlNamespaceManager

  ' is required to resolve the document's namespaces

  Dim ns As XmlNamespaceManager = New XmlNamespaceManager(xd.NameTable)

  ns.AddNamespace(String.Empty, "http://myns/slidesdemo")

  ns.AddNamespace("rv", "http://myns/slidesdemo/reviewdate")

  Dim node As XmlNode = xd.SelectSingleNode("descendant::rv:reviewed[1]", ns)

  ' create XmlWriter and write this node to disk using WriteTo method

  Dim ws As New XmlWriterSettings()

  ws.Indent = True

  Dim xw As XmlWriter = Nothing

  Try

    xw = XmlWriter.Create(sOutPath, ws)

    node.WriteTo(xw)

  Catch ex As Exception

    ' ... display error details here ...

  Finally

    ' xw is still Nothing if XmlWriter.Create failed
    If xw IsNot Nothing Then xw.Close()

  End Try

  builder.Append("<p>First <b>reviewed</b> element written to disk file using " _

               & "the <b>WriteTo</b> method, file contains:")

  builder.Append("<pre>" & Server.HtmlEncode(File.ReadAllText(sOutPath)) & "</pre>")

  ...

 

Then we can create another XmlWriter and use it in a call to the WriteContentTo method. In this case, the output is not actually well-formed XML. It's just the text value of the node (effectively an XML fragment), and so we have to set the conformance level of the XmlWriter to ConformanceLevel.Fragment. The contents of the StringBuilder we've been using to hold the output can then be dumped into the Label control on the page:

 

  ...

  ' now write this node to disk using WriteContentTo method

  ' as it contains only text, must set the ConformanceLevel

  ws.ConformanceLevel = ConformanceLevel.Fragment

  Try

    xw = XmlWriter.Create(sOutPath, ws)

    node.WriteContentTo(xw)

  Catch ex As Exception

    ' ... display error details here ...

  Finally

    If xw IsNot Nothing Then xw.Close()

  End Try

  builder.Append("<p>First <b>reviewed</b> element written to disk file using the <b>WriteContentTo</b> method, file contains:")

  builder.Append("<pre>" & Server.HtmlEncode(File.ReadAllText(sOutPath)) & "</pre>")

 

End If

 

' display the results

Label1.Text = builder.ToString()

 

Figure 5 shows the output from this section of code. You can clearly see that the WriteTo method outputs the start and end tags of the current node, while the WriteContentTo method does not.

 

Figure 5 - Using the WriteTo and WriteContentTo methods of the XmlDocument class
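
The same comparison can be made without touching the disk by pointing the XmlWriter at a StringBuilder. This is a sketch only; the Serialize helper, the module name and the sample XML (which omits the namespaces used in slides.xml) are assumptions for illustration.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module WriteToDemo

  ' Hypothetical helper: serializes a node with WriteTo (includeTags = True)
  ' or with WriteContentTo (includeTags = False)
  Function Serialize(ByVal node As XmlNode, ByVal includeTags As Boolean) As String
    Dim ws As New XmlWriterSettings()
    ws.OmitXmlDeclaration = True
    ' Fragment level is needed because WriteContentTo emits bare text here
    ws.ConformanceLevel = ConformanceLevel.Fragment
    Dim sb As New StringBuilder()
    Using xw As XmlWriter = XmlWriter.Create(sb, ws)
      If includeTags Then
        node.WriteTo(xw)
      Else
        node.WriteContentTo(xw)
      End If
    End Using
    Return sb.ToString()
  End Function

  Sub Main()
    Dim doc As New XmlDocument()
    doc.LoadXml("<slide><reviewed>2004-05-10T00:00:00</reviewed></slide>")
    Dim node As XmlNode = doc.SelectSingleNode("/slide/reviewed")
    ' prints <reviewed>2004-05-10T00:00:00</reviewed>
    Console.WriteLine(Serialize(node, True))
    ' prints 2004-05-10T00:00:00
    Console.WriteLine(Serialize(node, False))
  End Sub

End Module
```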

Re-Validating the XML Document after Editing

One thing that is not revealed in the preceding screenshots is that the edited XML document is not actually valid against the schema. The editing process does force the content to be well-formed, and raises an error if you try to insert new content or modify the existing content so that it is not well-formed. However, if you turn on validation and then click the Read & Save button again, you'll see that a validation error is reported after the ReadNode method has been used to insert the new node into the document (shown in Figure 6).

 

Figure 6 - Re-validating the document after the ReadNode method has inserted the new slide node

 

This is because we include the following code that re-validates the document (by calling the Validate method again) after the ReadNode method has been used to insert the new slide node:

 

  If chkValidate.Checked Then

    ' recheck validity of document - schema is already attached

    builder.Append("<p>Rechecking the validity of the document...<br />")

    xd.Validate(AddressOf MyValidationHandler)

  End If

 

The validation error occurs because the new slide node we inserted into the XML does not have the same default namespace as the rest of the document. The original document contains these namespace declarations in the root element:

 

<root xmlns="http://myns/slidesdemo" xmlns:rv="http://myns/slidesdemo/reviewdate">

 

The other slide elements live in this namespace, because they have no prefix or namespace declaration of their own:

 

  <slide position="1">

    <title>Agenda</title>

    <rv:reviewed>2004-05-10T00:00:00</rv:reviewed>

  </slide>

 

However, as we failed to provide this namespace declaration on the new slide node we inserted, it contains the attribute xmlns="", which "escapes it" from the current namespace:

 

  <slide position="4" xmlns:rv="http://myns/slidesdemo/reviewdate" xmlns="">

    <title>An Extra Slide</title>

    <rv:reviewed>2005-06-02T00:00:00</rv:reviewed>

  </slide>

 

Therefore, it does not conform to the schema for the document, and so an error is raised when revalidation takes place.
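
One way to avoid the problem is to declare the default namespace on the new slide element before it is read with ReadNode. The sketch below compares the namespace of a node read with and without that declaration; the ReadElement helper, the module name and the inline XML are assumptions for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ReadNodeNamespaces

  Const SlidesNs As String = "http://myns/slidesdemo"

  ' Hypothetical helper: reads one element from a string using ReadNode
  Function ReadElement(ByVal doc As XmlDocument, ByVal xml As String) As XmlNode
    Using xr As XmlReader = XmlReader.Create(New StringReader(xml))
      xr.MoveToContent()
      Return doc.ReadNode(xr)
    End Using
  End Function

  Sub Main()
    Dim doc As New XmlDocument()
    doc.LoadXml("<root xmlns=""" & SlidesNs & """><slides /></root>")

    Dim noNs As XmlNode = ReadElement(doc, _
        "<slide position=""4""><title>An Extra Slide</title></slide>")
    Dim withNs As XmlNode = ReadElement(doc, _
        "<slide position=""4"" xmlns=""" & SlidesNs & """>" & _
        "<title>An Extra Slide</title></slide>")

    ' the first node is in no namespace, the second matches the document
    Console.WriteLine("Without declaration: '" & noNs.NamespaceURI & "'")
    Console.WriteLine("With declaration: '" & withNs.NamespaceURI & "'")

    ' only the second node belongs in the slides element
    doc.DocumentElement.FirstChild.AppendChild(withNs)
    Console.WriteLine(doc.OuterXml)
  End Sub

End Module
```

The namespace of a node is fixed when it is created, so appending a node that is in no namespace under a namespaced parent does not move it into that namespace.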

Using the CheckValidity Method of the XPathNavigator

We mentioned earlier that you can also validate an XML document stored in any of the three XML document stores using an XPathNavigator. We demonstrate this in the example page - in the output you see when you click the Create Navigator button. Most of this page is identical to that we used with an XPathDocument earlier in this article, and so we haven't repeated the code or description of its workings here. However, the one difference is that we check if the chkValidate checkbox is set, and if so execute the following code:

 

If chkValidate.Checked Then

  ' check validity of this part of document using the XPathNavigator

  builder.Append("<p>Checking the validity of the current node...<br />")

  builder.Append("XPathNavigator.<b>CheckValidity</b> method returned " _

      & xn.CheckValidity(Nothing, AddressOf MyValidationHandler) & "</p>")

End If

 

This code simply calls the CheckValidity method of the XPathNavigator, passing in two parameters. The first is an XmlSchemaSet containing the schemas to use for validation, and the second is a reference to the event handler that will be executed when validation errors are detected. As the document already has an XmlSchemaSet attached to the Schemas property, we can use Nothing (null in C#) for the first parameter (you will usually use this parameter only when validating XML fragments). The second parameter can also be Nothing, in which case any validation error is thrown as an exception instead of being passed to a handler.

 

So if you run the example, turn on validation and select an invalid document, and then click the Create Navigator button, you'll see output like that shown in Figure 7. The XML document is first validated by the routine that loads it into the XmlDocument instance (as before). The code then creates an XPathNavigator, moves it to the second slide element (the one that contains invalid content) and displays the properties of the XPathNavigator. After that, the CheckValidity method is called to check the current node against the schema for the document, and you can see that it reports the same two errors as the Validate method called directly on the XmlDocument.

 

Figure 7 - Using the CheckValidity method of the XPathNavigator class
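
A stripped-down version of this check, run from the document node of an XmlDocument that has a schema attached, might look like the following sketch. The inline schema, the module name and the IsValid helper are assumptions for illustration and are not taken from the sample page.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema
Imports System.Xml.XPath

Module CheckValidityDemo

  ' Hypothetical schema: a slide element containing a single title
  Private Const Xsd As String = _
      "<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">" & _
      "<xs:element name=""slide""><xs:complexType><xs:sequence>" & _
      "<xs:element name=""title"" type=""xs:string""/>" & _
      "</xs:sequence></xs:complexType></xs:element></xs:schema>"

  Private Sub OnValidation(ByVal sender As Object, ByVal args As ValidationEventArgs)
    Console.WriteLine(args.Severity.ToString() & ": " & args.Message)
  End Sub

  ' Returns True if the XML is valid against the schema attached to the document
  Function IsValid(ByVal xml As String) As Boolean
    Dim doc As New XmlDocument()
    Using schemaReader As XmlReader = XmlReader.Create(New StringReader(Xsd))
      doc.Schemas.Add(Nothing, schemaReader)
    End Using
    doc.LoadXml(xml)
    Dim xn As XPathNavigator = doc.CreateNavigator()
    ' Nothing for the first parameter means "use doc.Schemas"
    Return xn.CheckValidity(Nothing, AddressOf OnValidation)
  End Function

  Sub Main()
    Console.WriteLine(IsValid("<slide><title>Agenda</title></slide>"))
    Console.WriteLine(IsValid("<slide><heading>Agenda</heading></slide>"))
  End Sub

End Module
```

The Boolean return value makes CheckValidity convenient for a quick pass/fail test, while the handler still receives the detail of each problem.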

 

Reading and Writing XML with an XmlDataDocument

The final example for this article, xmldatadocument.aspx, demonstrates the extra feature of the XmlDataDocument over the XmlDocument. Remember that XmlDataDocument inherits from XmlDocument, and so it contains all the features you've just seen for the XmlDocument class. The one addition we'll focus on here is the ability of the XmlDataDocument to expose the XML it contains as a standard ADO.NET DataSet instance.

 

In fact, this is a simplification in that there is no difference between the data exposed by an XmlDataDocument and the data in the DataSet it exposes - they are just two views of the same underlying data. If you change the content in one view, the other view will contain these changes. What it does do, however, is allow you to use relational methods to manipulate XML data, and vice versa. The example we provide demonstrates this in a useful way, as well as showing how you can use the extra features of the XmlDataDocument class.

 

As with the other examples, the code declares some variables that will be used within the main routine. We'll need an XML document and the matching schema, and a path for the output file we'll be creating:

 

' create StringBuilder to hold the results

Dim builder As New StringBuilder()

 

' create path to XML and schema disk files

Dim sInPath As String = Server.MapPath("data/slides-for-datadoc.xml")

Dim sSchemaPath As String = Server.MapPath("data/schema/slides-for-datadoc.xsd")

 

' create path to disk file for output XML file

Dim sOutPath As String = Server.MapPath("output/xmldatadocument.xml")

Loading an XmlDataDocument with XML Data and a Schema

The XmlDataDocument exposes the same Load and LoadXml methods as the XmlDocument. However, we want to be able to access the XML data as a relational view, using the DataSet property of the XmlDataDocument. For this to be possible, we have to load a schema into the XmlDataDocument first so that the structure and types within the XML document can be properly determined. But how do we "load a schema" into an XML parser? The answer is that we don't - we load it into the DataSet exposed by the XmlDataDocument.

 

So, our code first creates a new XmlDataDocument, and then gets a reference to the ADO.NET DataSet that is exposed by the DataSet property. It then calls the ReadXmlSchema method of the DataSet. Like the Load method of the XML document stores, this method accepts as the input source a Stream, a TextReader, an XmlReader or a URI/path to a file that contains the schema. We're using the last of these in our example. Then we can load the XML document directly into the XmlDataDocument:

 

Dim xd As XmlDataDocument

Dim ds As DataSet

Try

  ' create a new empty XmlDataDocument instance

  xd = New XmlDataDocument()

  ' get a reference to the DataSet exposed by the XmlDataDocument

  ds = xd.DataSet

  ' have to load the schema for the XML into the DataSet first

  ' or else the property will not expose the populated DataSet

  ds.ReadXmlSchema(sSchemaPath)

  ' now load the data from the XML document

  xd.Load(sInPath)

Catch ex As Exception

  ' ... display error details here ...

  Return

End Try

...
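For experimenting without the disk files, the same sequence can be sketched with the TextReader overloads of ReadXmlSchema and Load. The schema and XML strings below are assumptions made for illustration and are not the contents of slides-for-datadoc.xsd or slides-for-datadoc.xml. The msdata:IsDataSet annotation is included so that the root element is treated as the DataSet rather than as a table.

```vb
Imports System
Imports System.Data
Imports System.IO
Imports System.Xml

Module LoadDemo

  ' hypothetical schema: one "slide" table with a string and a decimal column
  Const SchemaText As String = _
    "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'" & _
    " xmlns:msdata='urn:schemas-microsoft-com:xml-msdata'>" & _
    "<xs:element name='root' msdata:IsDataSet='true'><xs:complexType>" & _
    "<xs:sequence><xs:element name='slide' maxOccurs='unbounded'>" & _
    "<xs:complexType><xs:sequence>" & _
    "<xs:element name='title' type='xs:string'/>" & _
    "<xs:element name='usage-cost' type='xs:decimal'/>" & _
    "</xs:sequence></xs:complexType></xs:element></xs:sequence>" & _
    "</xs:complexType></xs:element></xs:schema>"

  ' hypothetical XML document that matches the schema above
  Const XmlText As String = _
    "<root>" & _
    "<slide><title>Agenda</title><usage-cost>4.90</usage-cost></slide>" & _
    "<slide><title>Techniques</title><usage-cost>9.5</usage-cost></slide>" & _
    "</root>"

  Function LoadSlides() As DataSet
    Dim xd As New XmlDataDocument()
    Dim ds As DataSet = xd.DataSet
    ' the schema goes into the DataSet first ...
    ds.ReadXmlSchema(New StringReader(SchemaText))
    ' ... and only then is the XML loaded into the XmlDataDocument
    xd.Load(New StringReader(XmlText))
    Return ds
  End Function

  Sub Main()
    Dim table As DataTable = LoadSlides().Tables("slide")
    ' list each column with the .NET type taken from the schema
    For Each col As DataColumn In table.Columns
      Console.WriteLine("{0} ({1})", col.ColumnName, col.DataType.Name)
    Next
    Console.WriteLine("Rows: {0}", table.Rows.Count)
  End Sub

End Module
```

Because the column types come from the schema, the usage-cost values are held as Decimal rather than String, which matters when the rows are filtered and sorted later.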

Accessing the XML as a Relational View

Now, the DataSet property will expose the XML data as a DataSet. Depending on the hierarchy of the elements in the XML document, you may get more than one table in the resulting DataSet. However, we've chosen an XML document that has a flat hierarchy (a single root element that contains multiple slide elements - each of which contains just a title, reviewed and usage-cost element). This means that we'll just get a single table, with columns for the title, reviewed and usage-cost elements.

 

The quickest and easiest way to see what the DataSet contains is to bind it to a grid control - in our example this is the ASP.NET GridView. We declare this in the page like this:

 

<asp:GridView ID="grid1" runat="server" />

 

The code in our example then binds the GridView to the first table in the DataSet:

 

...

' bind the first table in the DataSet to the GridView control

grid1.DataSource = ds.Tables(0)

grid1.DataBind()

...

 

Figure 8 shows the results. You can see the grid at the top of the page, containing five rows that represent the five slide elements in the XML document (ignore the remainder of the output in the page for the moment - we'll be coming to that next).

 

Figure 8 - Using an XmlDataDocument to sort and filter the contents of an XML document

 

Selecting and Filtering the XML Content

Now that we know we have a DataSet containing some rows, we can play with it. A useful trick is to perform filtering and sorting on the rows, which is effectively equivalent to applying an XSL-T style sheet to the XML document. If you struggle with XSL-T, but are familiar with ADO.NET, this might be a technique worth looking into.  

 

The code in our example creates an array of DataRow references from the DataSet table using the Select method of the DataTable class, specifying only rows where the value in the usage-cost column is greater than 4 and sorting the resulting rows by the usage-cost value:

 

...

Dim sFilter As String = "[usage-cost] > 4"

Dim sSort As String = "[usage-cost]"

Dim rowsArray() As DataRow = ds.Tables(0).Select(sFilter, sSort)

...
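The Select call can be tried on its own, without any XML. The following sketch builds a small DataTable by hand, so the table and its rows are made-up sample data. It assumes the usage-cost column is numeric, as it would be if the schema declares it that way. Note the square brackets, which are needed in the filter expression because the column name contains a hyphen.

```vb
Imports System
Imports System.Data

Module SelectDemo

  ' build a table by hand; names and values are illustrative only
  Function BuildTable() As DataTable
    Dim table As New DataTable("slide")
    table.Columns.Add("title", GetType(String))
    table.Columns.Add("usage-cost", GetType(Decimal))
    table.Rows.Add("Agenda", 4.9D)
    table.Rows.Add("Overview", 2.25D)
    table.Rows.Add("Techniques", 9.5D)
    table.Rows.Add("Code Examples", 7.55D)
    Return table
  End Function

  ' return the rows costing more than 4, most expensive first
  Function SelectExpensive(ByVal table As DataTable) As DataRow()
    Return table.Select("[usage-cost] > 4", "[usage-cost] DESC")
  End Function

  Sub Main()
    For Each row As DataRow In SelectExpensive(BuildTable())
      Console.WriteLine("{0}: {1}", row("title"), row("usage-cost"))
    Next
  End Sub

End Module
```

Appending DESC to the sort expression reverses the order. The article's own code omits it, so its rows come out in ascending order of usage-cost.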

Creating a Custom XML Document with an XmlWriter

Now that we have an array of DataRow references, we can use this to generate the output XML document. This hinges on two main features: the GetElementFromRow method of the XmlDataDocument that returns an XML element representing the DataRow we pass to it, and the willingness of the XmlWriter to accept complete XML elements as String values.

 

So, the code now creates an XmlWriterSettings instance, and sets the required encoding for the output. Notice that we're using Unicode (UTF-16) here, whereas the source XML document was UTF-8 encoded, so this provides another way to re-encode an XML document as well as creating custom content for it. We haven't bothered setting the Indent property, because this has no effect when we write multiple nested XML elements as String values through the XmlWriter. Instead, we'll add the line breaks we want as we create the output document.

 

We use the XmlWriterSettings instance when we create the XmlWriter, then start the new document by calling the WriteStartDocument method. We also insert a comment into the document, together with our line breaks. These can be inserted by calling the WriteRaw method of the XmlWriter, which simply takes the String value and writes it to the output (there is also an overload of the WriteRaw method that uses a character buffer instead of a String).

 

Next, we iterate through the DataRow references in our array, and for each one call the GetElementFromRow method of the XmlDataDocument to get the XML element that represents the row. The OuterXml property of this element gives us a String containing the element and all its content. This string is then written to the output, followed by a carriage return and line feed. Once all the rows have been processed, we complete the XML document by calling the WriteEndElement and WriteEndDocument methods of the XmlWriter:

 

...

' create an XmlWriter instance with Unicode encoding

Dim ws As New XmlWriterSettings()

ws.Encoding = Encoding.Unicode

Dim xw As XmlWriter = Nothing

Dim node As XmlNode = Nothing

Try

  xw = XmlWriter.Create(sOutPath, ws)

  ' now ready to start creating the output document

  ' indenting is off, so manually add line breaks

  xw.WriteStartDocument()

  xw.WriteRaw(ControlChars.CrLf)

  xw.WriteStartElement("root")

  xw.WriteRaw(ControlChars.CrLf)

  xw.WriteComment("XML extracted from DataSet table")

  xw.WriteRaw(ControlChars.CrLf)

  ' iterate through the rows selected in the array

  For Each row As DataRow In rowsArray

    ' extract an XML element that represents the row

    node = xd.GetElementFromRow(row)

    ' and write it to the disk file

    xw.WriteRaw(node.OuterXml & ControlChars.CrLf)

  Next

  xw.WriteEndElement()

  xw.WriteEndDocument()

Catch ex As Exception

  ' ... display error details here ...

  Return

Finally

  ' xw is still Nothing if XmlWriter.Create failed

  If xw IsNot Nothing Then xw.Close()

End Try

builder.Append("<p>XML file created using the <b>GetElementFromRow</b> method, file contains:")

builder.Append("<pre>" & Server.HtmlEncode(File.ReadAllText(sOutPath)) & "</pre>")

 

' display the results

Label1.Text = builder.ToString()

 

The final step is to dump the contents of the StringBuilder into the Label control on the page. You can look back at Figure 8 to see the XML document that is created; however, the fact that there are no line breaks within each slide element makes it hard to see the results. The document with added carriage returns is shown below to make it easier to appreciate what we've achieved with only a few lines of simple code:

 

<?xml version="1.0" encoding="utf-16"?>

<root>

 <!--XML extracted from DataSet table-->

 <slide>

  <title>Agenda</title>

  <review-year>2004</review-year>

  <review-month>5</review-month>

  <review-day>10</review-day>

  <usage-cost>4.90</usage-cost>

 </slide>

 <slide>

  <title>Code Examples</title>

  <review-year>2004</review-year>

  <review-month>12</review-month>

  <review-day>19</review-day>

  <usage-cost>7.55</usage-cost>

 </slide>

 <slide>

  <title>Techniques</title>

  <review-year>2005</review-year>

  <review-month>4</review-month>

  <review-day>22</review-day>

  <usage-cost>9.5</usage-cost>

 </slide>

</root>
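The listing above was formatted by hand. An alternative worth trying, sketched here and not taken from the example page, is to pass the XmlWriter to each element's WriteTo method instead of writing the OuterXml string with WriteRaw. Because the nodes then go through the writer's normal methods, the Indent setting should take effect. The module name, sample data and the use of a StringBuilder as the output target are all assumptions.

```vb
Imports System
Imports System.Data
Imports System.Text
Imports System.Xml

Module IndentDemo

  Function BuildXml() As String
    ' illustrative data only - not the contents of the example file
    Dim ds As New DataSet("root")
    Dim table As DataTable = ds.Tables.Add("slide")
    table.Columns.Add("title", GetType(String))
    table.Columns.Add("usage-cost", GetType(Decimal))
    table.Rows.Add("Techniques", 9.5D)
    table.Rows.Add("Agenda", 4.9D)
    Dim xd As New XmlDataDocument(ds)

    ' ask the writer to indent the output for us
    Dim ws As New XmlWriterSettings()
    ws.Indent = True

    Dim sb As New StringBuilder()
    Using xw As XmlWriter = XmlWriter.Create(sb, ws)
      xw.WriteStartDocument()
      xw.WriteStartElement("root")
      xw.WriteComment("XML extracted from DataSet table")
      For Each row As DataRow In table.Select("[usage-cost] > 4", "[usage-cost]")
        ' write the element node by node instead of as a raw string
        xd.GetElementFromRow(row).WriteTo(xw)
      Next
      xw.WriteEndElement()
      xw.WriteEndDocument()
    End Using
    Return sb.ToString()
  End Function

  Sub Main()
    Console.WriteLine(BuildXml())
  End Sub

End Module
```

The Using block also closes the writer if an exception occurs, which removes the need for a separate Finally section.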

Summary

In this, the last of a series of three articles that discuss techniques for reading and writing XML in version 2.0 of the .NET Framework, we've looked at some of the different ways that you can load, persist and write XML using the three XML document stores available in version 2.0 of the .NET Framework.

 

We looked at how you create an XPathDocument containing an XML document, and how you can navigate through it using an XPathNavigator. The XPathNavigator exposes properties such as InnerXml and OuterXml that allow you to extract the content, as well as the ReadSubtree and WriteSubtree methods that return individual nodes and their content.

 

Next, we examined the XmlDocument, seeing how it provides extra properties that expose the content, and methods that allow you to load and save the XML content. You can also insert new nodes by reading them into the document through an XmlWriter, and write out nodes and their content using the WriteTo and WriteContentTo methods. 

 

We also examined techniques for validating an XML document whilst it is loaded into an XML document store, obviating the need to create an