©2005 Alex Homer, Stonebroom Limited – alex@stonebroom.com
This is the second in a series of three articles that look in detail at how the new features of the XmlReader and XmlWriter classes in version 2.0 of the .NET Framework can be used to read and write XML documents, and interact with the new XML document store objects. The topics covered in the previous article are:
In this article, we move on to look at the XmlWriter and XmlWriterSettings classes, and how they can be used to write XML documents and fragments more easily and more efficiently than in version 1.x of .NET. The topics we'll be covering are:
As in the previous article, we'll look into the issues involved in using the new classes, the reasoning behind the changes, and how the new features simplify your code and provide better overall efficiency for your applications.
The previous article discussed the concepts behind the new "settings" classes for XmlReader and XmlWriter. The two new classes named XmlReaderSettings and XmlWriterSettings can be used to maintain a consistent set of behaviors when generating instances of readers and writers on demand, without having to repeatedly set their properties. This has several benefits in that it:
The "settings" classes are used with the new static Create method of the XmlReader and XmlWriter classes, as demonstrated in the previous article. So, having seen how you use an XmlReaderSettings instance to specify the behavior of an XmlReader, the way you use the XmlWriterSettings class will be obvious. Of course, XmlWriter has a different set of properties compared to XmlReader, and these properties are reflected in the XmlWriterSettings class. Figure 1 shows the XmlWriterSettings class, and you can see that some of the properties are similar to those of XmlReaderSettings. There is a CheckCharacters property that controls whether illegal XML characters are permitted, and a CloseOutput property that controls whether the underlying output stream is closed automatically when the writer is closed.

Figure 1 - The XmlWriterSettings Class
You can specify whether the elements in the output are placed on a new line and indented, whether each attribute is placed on a new line, and the characters that are used for the indentation and the line breaks. Even more control over the output format is provided by the NewLineHandling enumeration. By default, new line characters in the content you write are replaced with the value of the NewLineChars property, which defaults to the standard Windows new line characters \r\n (i.e. the default is NewLineHandling.Replace). However, you can turn off this replacement and leave the characters unchanged using the value NewLineHandling.None, or have them written as character entities, so that they survive being read back by a normalizing XmlReader, using the value NewLineHandling.Entitize.
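The effect of the NewLineHandling values is easiest to see side by side. The following is a minimal sketch rather than code from the example page: the module name, the WriteSample helper and the note element are our own inventions for illustration. It writes the same text, containing a bare line feed, with each setting and prints the markup with the control characters made visible.

```vbnet
Imports System
Imports System.Text
Imports System.Xml
Imports Microsoft.VisualBasic

Module NewLineHandlingSketch

    ' Hypothetical helper: writes one element whose text contains a
    ' bare line feed, using the NewLineHandling value supplied.
    Function WriteSample(handling As NewLineHandling) As String
        Dim ws As New XmlWriterSettings()
        ws.OmitXmlDeclaration = True
        ws.NewLineHandling = handling
        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            xw.WriteElementString("note", "line one" & vbLf & "line two")
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        Dim modes() As NewLineHandling = {NewLineHandling.Replace, NewLineHandling.Entitize, NewLineHandling.None}
        For Each mode As NewLineHandling In modes
            ' show carriage returns and line feeds as \r and \n
            Dim shown As String = WriteSample(mode).Replace(vbCr, "\r").Replace(vbLf, "\n")
            Console.WriteLine("{0}: {1}", mode, shown)
        Next
    End Sub

End Module
```

Running this in a console project is a quick way to check how each setting treats the characters on your own platform before you rely on it.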
As with XmlReader, you can specify the conformance level for an XmlWriter so that it will generate output that is not actually a complete XML document when you only want to write fragments of XML. For example, as you'll see shortly, you can create XML fragments that contain multiple root nodes.
If you are using the XmlWriter to stream output from an XSL-T transformation, using the new XslCompiledTransform class, the read-only OutputMethod property indicates the format of the output that is generated by the writer. This is a value from the XmlOutputMethod enumeration shown in Figure 1, allowing you to detect whether the output is serialized as HTML, XML, or just the text content is serialized.
Finally, the XmlWriterSettings class exposes a reference to an Encoding class instance that specifies the encoding to be applied to the document that is generated. The default is System.Text.Encoding.UTF8, but you can change this to suit the requirements of your application. Some of the ways that you can use the XmlWriterSettings class are discussed next.
To create an XmlWriter instance, you first instantiate an instance of the XmlWriterSettings class, set the properties you want, and then call the Create method of the XmlWriter class. For example, this code creates an XmlWriter that checks for invalid characters in the XML you write, and closes the underlying stream when the writer is closed. Notice how we use the new Using construct to ensure that the writer is correctly disposed at the end of the process. This ensures that the XmlWriter is closed, even if we forget to call the Close method (the Using construct was available in C# in version 1.1, and is now available in VB.NET in version 2.0):
Dim ws As New XmlWriterSettings()
ws.CheckCharacters = True
ws.CloseOutput = True
Using xw As XmlWriter = XmlWriter.Create(sFilePath, ws)
    Try
        ' ... create the XML document here ...
        xw.Close()
    Catch ex As Exception
        ' ... display error details ...
    End Try
End Using
You can, of course, create the XmlWriter without taking advantage of the Using method, for example if you want to return it from a function so that it can be accessed from elsewhere in your application. The function shown next creates an XmlWriter using the static Create method, taking as parameters a reference to an XmlWriterSettings instance and a Stream for the output to be sent to:
Function GetXmlWriter(ws As XmlWriterSettings, outStream As Stream) As XmlWriter
    Dim xw As XmlWriter = Nothing
    Try
        xw = XmlWriter.Create(outStream, ws)
        Return xw
    Catch ex As Exception
        ' creation failed - close the writer only if one was created
        Try
            If Not (xw Is Nothing) Then xw.Close()
        Catch
        End Try
        Return Nothing
    End Try
End Function
You could then use this function like this:
Dim settings As New XmlWriterSettings()
settings.CheckCharacters = True
settings.CloseOutput = True
' write an XML document to the ASP.NET Response
Dim webWriter As XmlWriter = GetXmlWriter(settings, Response.OutputStream)
If Not (webWriter Is Nothing) Then
    ' ... create the XML document here ...
    webWriter.Close()
End If
...
' write an XML document to the screen
settings.Indent = True
settings.Encoding = Encoding.ASCII
Dim screenWriter As XmlWriter = GetXmlWriter(settings, Console.OpenStandardOutput())
If Not (screenWriter Is Nothing) Then
    ' ... create the XML document here ...
    screenWriter.Close()
End If
Once you've created the XmlWriter, you can use it to generate the output you need. The XmlWriter in version 2.0 supports pretty much the same set of methods for writing XML elements, attributes, comments, declarations, processing instructions, etc., as the version 1.x XmlWriter class. So you can create a simple XML document like this:
...
xw.WriteStartDocument(True)
xw.WriteComment("Created at " & DateTime.Now.ToString("HH:mm:ss"))
xw.WriteStartElement("root-node")
xw.WriteElementString("child-node", "The element value")
xw.WriteStartElement("next-child")
xw.WriteAttributeString("some-attribute", "A value")
xw.WriteAttributeString("another-attribute", "Another value")
xw.WriteValue("The next element value")
xw.WriteEndElement()
xw.WriteEndDocument()
xw.Close()
This creates an XML document containing a root node and two child nodes, with two attributes on the second child node. The result is:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!--Created at 13:37:46-->
<root-node>
  <child-node>The element value</child-node>
  <next-child
    some-attribute="A value"
    another-attribute="Another value">The next element value</next-child>
</root-node>
Note how the WriteStartElement and WriteElementString methods allow you to create the nested hierarchy. The WriteStartElement method creates the opening element tag, and you then populate the element using the WriteValue, WriteChars or WriteString method. Calling WriteEndElement automatically closes the most recently opened element by writing the appropriate end tag. The WriteElementString method generates a complete element, including the opening and closing tags. Similar methods named WriteStartAttribute, WriteEndAttribute and WriteAttributeString allow you to add attributes to the elements as they are created.
The WriteStartDocument method prepares the writer and generates the opening <?xml version="1.0"?> declaration, while the WriteEndDocument method automatically closes all open elements in the correct order and resets the writer. Other methods of the XmlWriter class (and there are lots of them) include WriteDocType (which creates a <!DOCTYPE> declaration), WriteComment, WriteCData (which creates a <![CDATA[...]]> section), WriteCharEntity, WriteEntityRef, WriteProcessingInstruction and WriteWhitespace. Look up "XmlWriter methods" in the Help file or SDK for a full list and descriptions of each one.
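To give a feel for a couple of these other methods, here is a short sketch of our own (it is not taken from the example pages). The module name, the script element, the stylesheet name style.xsl and the content written are all assumptions made purely for illustration. It writes a processing instruction, a comment and a CDATA section, and relies on WriteEndDocument to close the element that is still open.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module OtherWriteMethodsSketch

    Sub Main()
        Dim ws As New XmlWriterSettings()
        ws.Indent = True

        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            xw.WriteStartDocument()
            ' processing instruction placed before the root element
            ' (style.xsl is a hypothetical file name)
            xw.WriteProcessingInstruction("xml-stylesheet", "type=""text/xsl"" href=""style.xsl""")
            xw.WriteComment("Generated by a sketch")
            xw.WriteStartElement("root-node")
            xw.WriteStartElement("script")
            ' CDATA section - the < and & characters are not escaped
            xw.WriteCData("if (a < b && b < c) { run(); }")
            xw.WriteEndElement()
            ' root-node is still open; WriteEndDocument closes it
            xw.WriteEndDocument()
        End Using

        Console.WriteLine(sb.ToString())
    End Sub

End Module
```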
Most of the XML documents you generate will need to include namespace declarations, and the XmlWriter provides support for this in all of the "write" methods that generate elements and attributes. These methods have overloads that accept a String parameter that is the namespace URI within which the element or attribute will live. If the document contains a declaration of a namespace prefix for that URI, the methods automatically use the prefix when generating the element or attribute by prepending it to the local name.
You can see how this works using the example page we provide, named xmlwritersettings.aspx. This page, shown in Figure 2, demonstrates many of the features of the XmlWriterSettings and XmlWriter class that we discuss in this article. You can run or download all of the samples from our Website at http://www.daveandal.net/articles/readwritexml/.

Figure 2 - The XmlWriterSettings and XmlWriter Example Page
We'll look at the various option settings in the page shortly, but for the moment concentrate on the XML document that is generated by the page. This contains a default namespace on the root-node element, as well as the declaration of a namespace prefix "qn" that represents a different namespace. The last element within the root-node element uses this namespace prefix to place it within the second namespace. Generating this kind of multi-namespace XML document is easy using the XmlWriter - you just need to be aware of a few issues that affect the generated output.
The first point to note is when you want to apply a default namespace using the xmlns attribute without a namespace prefix declaration. Following the rules of XML, the root node of your document also lives in this namespace, but the namespace is not declared when you call the WriteStartElement method to create the opening tag for the root element. Therefore you must specify this namespace when you call the WriteStartElement method:
xw.WriteStartElement("root-node", "http://testdemo")
Then you can add the xmlns attribute if you wish, but in fact it's not necessary because the XmlWriter detects that you are now in a new namespace and adds the attribute xmlns="http://testdemo" to the element automatically. If you look at the source code for the example page (there is a [view source] link at the bottom of the page), you'll see this in the code comments.
The same process occurs if you call WriteStartElement or WriteElementString and specify a new namespace. The XmlWriter automatically adds an xmlns attribute to the element so that it is the new default namespace for the enclosed XML content:
xw.WriteElementString("child-node", "http://testdemo/children", "The element value")
This creates the element child-node as:
<child-node xmlns="http://testdemo/children">The element value</child-node>
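The new default namespace applies only to that element and its content. To "escape" from this namespace, and return to the previous namespace, you simply close the element: WriteElementString does this for you, while an element opened with WriteStartElement is closed by calling the WriteEndElement or WriteFullEndElement method.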
If you look back at Figure 2, you'll see that the last element in the document contains the prefix "qn":
<qn:qualified-node>Another element value</qn:qualified-node>
This namespace prefix is declared in the root-node element as:
<root-node xmlns:qn="http://testdemo/names" xmlns="http://testdemo">
To generate this namespace declaration, we first use the technique described above to generate the root-node element with the default namespace xmlns="http://testdemo". Then we use the overload of the WriteAttributeString method that accepts four parameters: the attribute prefix (xmlns), the local name of the attribute (which is the namespace prefix qn that we are declaring), the namespace of the attribute itself (omitted by passing Nothing, because the writer treats xmlns specially), and the value for this attribute, which is the namespace we want to associate with the new prefix:
xw.WriteStartElement("root-node", "http://testdemo")
xw.WriteAttributeString("xmlns", "qn", Nothing, "http://testdemo/names")
Now we can generate elements that use the new namespace prefix simply by specifying the namespace when we call the WriteElementString or WriteAttributeString method:
xw.WriteElementString("qualified-node", "http://testdemo/names", "Another element value")
However, the new overload of the WriteElementString method added to the XmlWriter in version 2.0 allows you to generate elements in a separate namespace in a single operation, without having to add the namespace prefix declaration to the root node, if this suits the document format you want. This overload accepts four String values: the namespace prefix, the element name, the namespace URI and the element value - for example:
xw.WriteElementString("mqn", "qtwo-node", "http://testdemo/morenames", "Some value")
This generates an element containing both the namespace prefix and the declaration of that prefix, placing this element into that namespace without changing the default namespace:
<mqn:qtwo-node xmlns:mqn="http://testdemo/morenames">Some value</mqn:qtwo-node>
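The individual calls discussed in this section can be pulled together into one small console program. This is only a sketch that reuses the element names and namespace URIs from the text; the module name and the choice of a StringBuilder as the output are our own assumptions, and the example page itself may be organized differently.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module NamespaceSketch

    Sub Main()
        Dim ws As New XmlWriterSettings()
        ws.Indent = True
        ws.OmitXmlDeclaration = True

        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            ' root element in the default namespace - the writer adds
            ' the xmlns attribute for us
            xw.WriteStartElement("root-node", "http://testdemo")
            ' declare the prefix qn on the root element
            xw.WriteAttributeString("xmlns", "qn", Nothing, "http://testdemo/names")
            ' a new default namespace that applies to this element only
            xw.WriteElementString("child-node", "http://testdemo/children", "The element value")
            ' namespace already declared, so the writer uses the qn prefix
            xw.WriteElementString("qualified-node", "http://testdemo/names", "Another element value")
            ' prefix and its declaration are both written on this element
            xw.WriteElementString("mqn", "qtwo-node", "http://testdemo/morenames", "Some value")
            xw.WriteEndElement()
        End Using

        Console.WriteLine(sb.ToString())
    End Sub

End Module
```

The output should contain elements equivalent to the ones quoted in the text above, although the order in which the writer emits the xmlns attributes on the root element is not something this sketch controls.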
The example page shown in Figure 2 contains the results of writing the XML document. In that figure, the document occupies multiple lines separated by carriage returns and is indented to make it easy to see what it contains. However, this is not the default format for the output from an XmlWriter. Unless you like to sit and admire your elegantly formatted XML documents in a text editor, the fact that they contain non-significant white-space (such as multiple spaces, tabs and carriage returns) between each element is irrelevant. In effect, XML is a stream of characters, with the data delimited by the node tags and a single space between each attribute.
The reason that Figure 2 shows the XML with carriage returns and indents is because we set these two options using the checkboxes at the top of the page - making it easy to see what the XML contains. Without them, the XML is displayed on a single line that scrolls off to the right of the browser window.
The XmlWriterSettings class exposes four properties that you can use to specify the formatting of the XML that is generated by the XmlWriter(s) you create from it. You can experiment with these in the example page - Figure 3 shows the effects of the following settings:
' indent the output and insert line breaks
ws.Indent = True
' start each attribute on a new line
ws.NewLineOnAttributes = True
' use a Tab for indents instead of the default two spaces
ws.IndentChars = ControlChars.Tab
' use two Return characters instead of one
ws.NewLineChars = ControlChars.CrLf & ControlChars.CrLf

Figure 3 - Applying Indenting and Custom NewLine Characters with an XmlWriterSettings Instance
You can also control whether the <?xml version="1.0"?> declaration is generated - in some cases you may want to omit this. For example, if you are generating output from multiple writers to create a compound document, you will only want the declaration to appear once at the start of the document. The example page contains a checkbox where you can set the OmitXmlDeclaration property of the XmlWriterSettings instance to True to see this in action.
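As a quick illustration of the OmitXmlDeclaration property, the sketch below writes the same element twice, once with the declaration and once without. The module name, the WriteItem helper and the item element are hypothetical names chosen for this sketch and do not appear in the example page.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module OmitDeclarationSketch

    ' Hypothetical helper: writes a single element and returns the
    ' markup, with or without the XML declaration.
    Function WriteItem(omit As Boolean) As String
        Dim ws As New XmlWriterSettings()
        ws.OmitXmlDeclaration = omit
        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            xw.WriteElementString("item", "value")
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine("With declaration:    " & WriteItem(False))
        Console.WriteLine("Without declaration: " & WriteItem(True))
    End Sub

End Module
```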
As well as generating well-formed XML documents, you can use the XmlWriter to create fragments that are not, on their own, well-formed XML. This simply involves setting the ConformanceLevel property of the XmlWriterSettings instance to ConformanceLevel.Fragment:
ws.ConformanceLevel = ConformanceLevel.Fragment
However, there is an issue to be aware of in this case. You cannot call the WriteStartDocument or WriteEndDocument method when you use ConformanceLevel.Fragment. And, as the WriteStartDocument method generates the XML declaration, this means that you will not get <?xml version="1.0"?> at the start of the output (you would not include this anyway if the fragment is not a well-formed document). The code in our example page checks the value of the ConformanceLevel property, and does not call WriteStartDocument or WriteEndDocument if a fragment is being generated. However, it does add a second element at the root level of the document to prove that you can create fragments that are not well-formed documents:
If ws.ConformanceLevel <> ConformanceLevel.Fragment Then
    ' cannot call WriteStartDocument for an XML fragment
    xw.WriteStartDocument(True)
End If
' ... write contents of document here as before ...
If ws.ConformanceLevel = ConformanceLevel.Fragment Then
    ' add a second root node, this is illegal in a well-formed document
    xw.WriteElementString("another-root-node", "This is a now a fragment")
Else
    ' cannot call WriteEndDocument for an XML fragment
    xw.WriteEndDocument()
End If
Figure 4 shows the results. You can see the fragment with its two root nodes, and there is no XML declaration at the start:

Figure 4 - Creating an XML Fragment with an XmlWriterSettings and XmlWriter
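If you want to try fragment output outside the example page, a self-contained sketch along the following lines should be enough. The module name and the text written into the two elements are our own; only the ConformanceLevel setting and the idea of two root-level elements come from the discussion above.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module FragmentSketch

    Sub Main()
        Dim ws As New XmlWriterSettings()
        ws.ConformanceLevel = ConformanceLevel.Fragment
        ws.Indent = True

        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            ' no WriteStartDocument or WriteEndDocument for a fragment
            xw.WriteElementString("root-node", "First root element")
            ' a second element at the root level is allowed in a fragment
            xw.WriteElementString("another-root-node", "Second root element")
        End Using

        Console.WriteLine(sb.ToString())
    End Sub

End Module
```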
The final feature that our example page demonstrates is how you can set the encoding of the XML documents you create with the XmlWriterSettings and XmlWriter classes. The default encoding for XML documents created this way is UTF-8. The XmlWriter uses an instance of the UTF8Encoding class to encode the output, so that it is suitable for use in almost all XML parsers, and in Web Services and other applications.
However, you can use the Encoding property of the XmlWriterSettings class to specify an alternative encoding if you wish. The drop-down list in the example page contains six values: ASCII, UTF7, UTF8, UTF32, Unicode (equivalent to UTF-16), and BigEndianUnicode. These correspond to the encoding classes available in the .NET Framework, and the code in the page applies the one you select (when you also set the checkbox in the page) using the static properties of the Encoding class:
If chkEncoding.Checked Then
    Select Case lstEncoding.SelectedItem.Text
        Case "ASCII"
            ws.Encoding = Encoding.ASCII
        Case "UTF7"
            ws.Encoding = Encoding.UTF7
        Case "UTF8"
            ws.Encoding = Encoding.UTF8
        Case "UTF32"
            ws.Encoding = Encoding.UTF32
        Case "Unicode"
            ws.Encoding = Encoding.Unicode
        Case "BigEndianUnicode"
            ws.Encoding = Encoding.BigEndianUnicode
    End Select
End If
If you run the example, and try different values, you'll see the encoding in the opening XML declaration. Notice that, because the Web browser automatically translates most encodings into the same visual output (see Figure 5), you don’t see any other difference except where you select "UTF7" - which the browser cannot translate!

Figure 5 - Specifying the Encoding with an XmlWriterSettings and XmlWriter
The XmlReader is really just a pull-model parser that exposes the XML as a stream. Meanwhile, the XmlWriter is an object that converts a stream into another format or persists it to another object. For example, an XmlReader can take its input from a disk file, a stream or another reader instance; while the XmlWriter can generate its output as a disk file, a stream, a StringBuilder or another writer instance. This means that you can link the XmlReader and XmlWriter together so that, as you read nodes from the XmlReader, you generate the output you require through the XmlWriter.
Why is this useful? Well, one particular case is when working with very large XML documents that you don’t want to load into memory using a document store object such as XmlDocument. It's also very efficient, which is useful if - for example - you just need to modify the occasional value in the XML or perform some kind of transformation process or business logic. Of course, taking this to its logical conclusion, the XmlReader and XmlWriter can also be used as the input and output vehicles for other classes such as XslCompiledTransform, in which case you hand off the actual processing of the XML to another object instead. In the next section, you'll see an example of streaming XML, along with some of the other useful new features of the XmlWriter class.
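Before looking at the example page, it may help to see the idea in its simplest form. The following sketch is our own and is deliberately incomplete: the module name and the tiny in-memory document are assumptions, it copies only elements and text, and it ignores attributes, comments and other node types. It reads nodes one at a time from an XmlReader and writes them to an XmlWriter, changing each text value as it passes through.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module StreamingSketch

    Sub Main()
        ' hypothetical input - a real application would read a large file
        Dim sourceXml As String = "<slides><slide><title>Agenda</title></slide>" &
                                  "<slide><title>Summary</title></slide></slides>"

        Dim rs As New XmlReaderSettings()
        rs.IgnoreWhitespace = True

        Dim ws As New XmlWriterSettings()
        ws.Indent = True
        ws.OmitXmlDeclaration = True

        Dim sb As New StringBuilder()
        Using xr As XmlReader = XmlReader.Create(New StringReader(sourceXml), rs)
            Using xw As XmlWriter = XmlWriter.Create(sb, ws)
                While xr.Read()
                    Select Case xr.NodeType
                        Case XmlNodeType.Element
                            ' attributes are not copied in this sketch
                            xw.WriteStartElement(xr.LocalName)
                            If xr.IsEmptyElement Then xw.WriteEndElement()
                        Case XmlNodeType.Text
                            ' modify the value on its way through
                            xw.WriteString(xr.Value.ToUpperInvariant())
                        Case XmlNodeType.EndElement
                            xw.WriteEndElement()
                    End Select
                End While
            End Using
        End Using

        Console.WriteLine(sb.ToString())
    End Sub

End Module
```

Because only the current node is held at any time, the same loop works for documents far too large to load into an XmlDocument.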
The second example we provide for this article, named readerwriter.aspx, demonstrates streaming and processing XML using an XmlReader and an XmlWriter. It shows how you can apply business rules to create XML that complies with a specific format, as well as using several other features of the XmlWriterSettings class and the System.Xml classes as a whole:
The example page provides four options to demonstrate some of the different ways you can stream output through an XmlReader and XmlWriter. As you can see in Figure 6, these options allow you to generate the output to a StringBuilder, a MemoryStream, the ASP.NET Response, or through a pipelined XmlTextWriter to a disk file. There is a link you can use to see the original XML document, and the results of the streaming process are shown at the bottom of the page.

Figure 6 - The Example Page that Streams XML from an XmlReader to an XmlWriter
If you want to generate XML as a String that you can then process, pass to another routine or just output to the user, you can take advantage of the overload of the static Create method for XmlWriter that accepts a StringBuilder. As the writer generates output, it is appended to the StringBuilder, and you can subsequently extract the complete XML document as a String by calling the ToString method of the StringBuilder.
The relevant code we use in the example shown in Figure 6 is listed below. Variables to hold instances of the XmlReader and XmlWriter are declared first, followed by the new StringBuilder that will receive the output. We also create a new MemoryStream instance and a variable to hold an XmlTextWriter. These last two are used to hold the XML when streaming to other types of output - as you'll see shortly.
' declare a variable to hold an XmlReader
Dim xr As XmlReader = Nothing
' declare a variable to hold an XmlWriter
Dim xw As XmlWriter = Nothing
' create a StringBuilder to hold the results
Dim builder As New StringBuilder()
' create a MemoryStream to hold the results
Dim memStream As New MemoryStream()
' declare an XmlTextWriter to pipeline results through
Dim pipedWriter As XmlTextWriter = Nothing
Then the code creates an XmlReaderSettings instance, sets the required properties, and from it generates an XmlReader over the input XML document. Any errors are reported in a separate StringBuilder, as you saw in the previous example:
' create an XmlReaderSettings instance and set some properties
Dim rs As New XmlReaderSettings()
rs.CheckCharacters = True
rs.CloseInput = True
rs.IgnoreWhitespace = True
rs.IgnoreProcessingInstructions = True
rs.ValidationType = ValidationType.None
Try
    ' create the XmlReader using the XmlReaderSettings instance
    Dim sInPath As String = Server.MapPath("data/slides-to-stream.xml")
    xr = XmlReader.Create(sInPath, rs)
Catch ex As Exception
    messages.Append("<p><b>ERROR creating XmlReader:</b><br />")
    messages.Append("Message = " & ex.Message & "</p>")
    Label1.Text &= messages.ToString()
    Return
End Try
Now an XmlWriterSettings instance is created, and used to generate a new XmlWriter. Notice that the CloseOutput property of the XmlWriterSettings instance is set to False. If the output is being sent to a MemoryStream, and the XmlWriter closes it automatically, we won’t be able to access the results. You can also see that we specify the encoding UTF-8 for the output. The source document is encoded as ASCII - it contains the attribute encoding="us-ascii" in the XML declaration as you will see if you click the link in the example page to view the XML document. However, by streaming it through an XmlReader and an XmlWriter we can change the encoding as required.
The actual type of output object depends on the selection in the option buttons, as you saw in Figure 6. This output object can be (in our example) the StringBuilder or the MemoryStream we created earlier, the ASP.NET Response.OutputStream (which is, of course, an object that inherits from Stream), or in the final case a new XmlTextWriter that points to a disk file (an example of wrapping or pipelining with the XmlWriter):
' create an XmlWriterSettings instance and set some properties
Dim ws As New XmlWriterSettings()
ws.CheckCharacters = True
ws.Indent = True
ws.Encoding = Encoding.UTF8
' do not close output automatically so that MemoryStream
' can be read after the XmlWriter has been closed
ws.CloseOutput = False
Try
    ' create the XmlWriter using the XmlWriterSettings instance
    Select Case lstMethod.SelectedValue
        Case 1 ' send output to a StringBuilder
            xw = XmlWriter.Create(builder, ws)
        Case 2 ' send output to a MemoryStream
            xw = XmlWriter.Create(memStream, ws)
        Case 3 ' send output to the ASP.NET Response
            xw = XmlWriter.Create(Response.OutputStream, ws)
        Case 4
            ' create an XmlTextWriter to wrap and pipeline the results through
            Dim sOutPath As String = Server.MapPath("output/streamed.xml")
            pipedWriter = New XmlTextWriter(sOutPath, Nothing)
            pipedWriter.Formatting = Formatting.Indented
            xw = XmlWriter.Create(pipedWriter, ws)
    End Select
Catch ex As Exception
    messages.Append("<p><b>ERROR creating XmlWriter:</b><br />")
    messages.Append("Message = " & ex.Message & "</p>")
    Label1.Text &= messages.ToString()
    Return
End Try
The code next streams the XML from the XmlReader to the XmlWriter, generating the output to the appropriate object depending on the selection you make in the page. We'll come back and look at this process after we see how the output is captured and displayed.
If the output object is a StringBuilder, the content can be extracted using the ToString method and displayed in a Label control on the page. If the output object is a MemoryStream, the code generates a StreamReader over this, sets the position to the start of the stream (the XmlWriter leaves the current position at the end of the stream), and then uses the ReadToEnd method to extract the content for display in the Label control in the page. The MemoryStream can then be closed:
...
... create the output here through the XmlReader and XmlWriter
...
Select Case lstMethod.SelectedValue
    Case 1
        ' writing to StringBuilder so just display the results in the page
        messages.Append("<b>Contents of the StringBuilder:</b>")
        Label2.Text &= "<pre>" & Server.HtmlEncode(builder.ToString()) & "</pre>"
    Case 2
        ' writing to MemoryStream so extract XML document for display
        messages.Append("<b>Contents of the MemoryStream:</b>")
        Dim sw As New StreamReader(memStream)
        memStream.Position = 0
        Label2.Text &= "<pre>" & Server.HtmlEncode(sw.ReadToEnd()) & "</pre>"
        ' remember to close the MemoryStream after use - it is not
        ' closed automatically because settings.CloseOutput = False
        memStream.Close()
...
The MemoryStream gives the same output in the page as when using a StringBuilder, but with one important exception. When you send the output to a StringBuilder it is always encoded as Unicode (UTF-16). This is the only character encoding used in the .NET Framework for String values, and so you see the attribute encoding="utf-16" in the opening XML declaration (look back at Figure 6). When using a MemoryStream, however, the stream encoding is set by the XmlWriter and so you see the attribute encoding="utf-8" in the opening XML declaration, which corresponds with the encoding we specified in the XmlWriterSettings instance.
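You can check this difference for yourself with a few lines of console code. The sketch below is an assumption-laden simplification of what the example page does: the module name and the root element are ours, and the same XmlWriterSettings instance (with UTF-8 requested) is used for both targets. If the behavior described above holds, the first declaration printed should report utf-16 and the second utf-8.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module EncodingTargetSketch

    Sub Main()
        Dim ws As New XmlWriterSettings()
        ws.Encoding = Encoding.UTF8
        ' leave the MemoryStream open so that it can be read afterwards
        ws.CloseOutput = False

        ' 1. write to a StringBuilder
        Dim builder As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(builder, ws)
            xw.WriteElementString("root", "value")
        End Using
        Console.WriteLine(builder.ToString())

        ' 2. write to a MemoryStream, then read the bytes back as text
        Using memStream As New MemoryStream()
            Using xw As XmlWriter = XmlWriter.Create(memStream, ws)
                xw.WriteElementString("root", "value")
            End Using
            ' the writer leaves the position at the end of the stream
            memStream.Position = 0
            Using reader As New StreamReader(memStream)
                Console.WriteLine(reader.ReadToEnd())
            End Using
        End Using
    End Sub

End Module
```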
If the output object is the ASP.NET response, the code has nothing else to do. The output will have been sent to the ASP.NET Response.OutputStream, and is displayed right at the top of the page (see Figure 7). Of course, only the text content of the elements is visible, because the browser ignores the XML elements, so a message indicating how to see it all is displayed instead:
...
    Case 3
        ' output dumped into Response so display message
        messages.Append("<b>XML output dumped into ASP.NET Response.OutputStream</b><br />")
        messages.Append("View the source of the page in your browser to see it...")
...
But if you design your streaming transformation to generate HTML, you can create the whole page this way - without having to learn XSL-T! Or as a more likely scenario, creating Web Services, you could generate the output XML document this way without having to first persist it to another object.

Figure 7 - Streaming the XML to the ASP.NET Response.OutputStream
The final option in the example page sends the XML output to a wrapped or pipelined XmlTextWriter. One important point to note here is that, because we set the CloseOutput property of the XmlWriterSettings instance to False, the XmlTextWriter will be left open after the XmlWriter is closed. We only did this so that we could access the MemoryStream output object, and normally you will set the CloseOutput property of the XmlWriterSettings instance to True unless you intend to write other output to it - perhaps when using more than one XmlWriter to build a compound document.
So the first step is to close the wrapped XmlTextWriter, and then the code can read the contents of the disk file that it created and display this in a Label control on the page:
...
    Case 4
        ' writing to disk with pipelined XmlTextWriter
        ' must remember to close the XmlTextWriter - it is not
        ' closed automatically because settings.CloseOutput = False
        pipedWriter.Close()
        ' now read and display contents of new XML disk file
        Dim sXML As String = File.ReadAllText(sOutPath)
        Label2.Text &= "<pre>" & Server.HtmlEncode(sXML) & "</pre>"
...
If you select this option and view the page or open the disk file (named streamed.xml in the output subfolder), you'll see that this time there is no encoding attribute in the XML declaration. The XmlWriter specified the encoding as UTF-8, and so this is the encoding used for the XML sent to the XmlTextWriter. However, we didn't set the encoding for the output of the XmlTextWriter, so there is no encoding specified in the new document. Figure 8 shows the result when the "Pipelined XmlWriter" option is selected.

Figure 8 - Streaming the XML through a Pipelined XmlTextWriter
Although not directly concerned with the topics of this article, we couldn't resist adding a bonus feature to the example page. In most cases, you'll need to provide a schema for the XML documents you use in your applications, and an easy way to create one is by inferring it from an XML document. This can be done using the Visual Studio IDE (load an XML document and select Create Schema from the XML menu), but you can also do it using the new XmlSchemaInference class in System.Xml version 2.0. The example page uses the following code to create an XML schema when you select the fourth option, and (see Figure 8) displays a link to view the new schema:
...
        ' can now use the XmlSchemaInference class to create an XML Schema
        ' create a new XmlReader over the XML document just created
        ' can use same XmlReaderSettings as when reading original XML
        Dim reader As XmlReader = XmlReader.Create(sOutPath, rs)
        Dim infer As New XmlSchemaInference()
        infer.Occurrence = XmlSchemaInference.InferenceOption.Restricted
        infer.TypeInference = XmlSchemaInference.InferenceOption.Restricted
        Dim ss As XmlSchemaSet = infer.InferSchema(reader)
        reader.Close()
        ' extract the new schema - first need to get an array of schemas
        ' that match the namespace in the document, then extract first one
        Dim sNamespace As String = "http://myns/slidesdemo/streamed"
        Dim sa As ArrayList = CType(ss.Schemas(sNamespace), ArrayList)
        Dim sch As XmlSchema = CType(sa(0), XmlSchema)
        ' create a StreamWriter and write the schema to a disk file
        Dim sr As New StreamWriter(Server.MapPath("output/streamed.xsd"))
        sch.Write(sr)
        sr.Close()
End Select
If you view the schema, you'll see that it has the correct data types for the elements. For example, the reviewed element is defined as type xs:dateTime, the usage-cost element as type xs:decimal, and the position attribute as type xs:unsignedByte.
Just as you can read values as the equivalent native CLR types from an XML document using an XmlReader, so you can write content to an XML document through an XmlWriter using CLR typed values. The WriteValue method has overloads that accept all of the simple CLR data types such as String, Double, Int32, Boolean and DateTime, and generates the appropriate XML content to represent the value for this typed instance.
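A short sketch shows the typed overloads in use. It is not the code from the example page: the module name, the approved element and the literal values are invented for illustration, although the other element names echo those used in the slides document. Each call passes a native CLR value and leaves the writer to produce the XML text for it, for example writing a Boolean in its lower-case XML form.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module WriteValueSketch

    Sub Main()
        Dim ws As New XmlWriterSettings()
        ws.Indent = True
        ws.OmitXmlDeclaration = True

        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            xw.WriteStartElement("slide")

            xw.WriteStartElement("position")
            xw.WriteValue(3)                          ' Int32 overload
            xw.WriteEndElement()

            xw.WriteStartElement("reviewed")
            xw.WriteValue(New DateTime(2004, 5, 10))  ' DateTime overload
            xw.WriteEndElement()

            xw.WriteStartElement("usage-cost")
            xw.WriteValue(4.9D)                       ' Decimal overload
            xw.WriteEndElement()

            xw.WriteStartElement("approved")
            xw.WriteValue(True)                       ' Boolean overload
            xw.WriteEndElement()

            xw.WriteEndElement()
        End Using

        Console.WriteLine(sb.ToString())
    End Sub

End Module
```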
We used the WriteValue method in the first example we showed in this article, but the values passed to it were all Strings and so it was not obvious what effect this method has. However, the second example that you've just seen (which writes data to a StringBuilder, MemoryStream , the ASP.NET Response, or a disk file) does demonstrate the effects of the WriteValue method method with other data types.
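Before looking at that example in detail, here is a minimal sketch of WriteValue with several CLR types, written to a StringBuilder. The module name, the BuildTypedXml function and the element names are assumptions chosen for illustration only.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module WriteValueSketch

    ' Hypothetical helper: writes one element per CLR type using WriteValue
    Function BuildTypedXml() As String
        Dim sb As New StringBuilder()
        Dim ws As New XmlWriterSettings()
        ws.OmitXmlDeclaration = True
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            xw.WriteStartElement("values")

            xw.WriteStartElement("count")
            xw.WriteValue(3)                          ' Int32
            xw.WriteEndElement()

            xw.WriteStartElement("cost")
            xw.WriteValue(4.9)                        ' Double
            xw.WriteEndElement()

            xw.WriteStartElement("approved")
            xw.WriteValue(True)                       ' Boolean
            xw.WriteEndElement()

            xw.WriteStartElement("reviewed")
            xw.WriteValue(New DateTime(2004, 5, 10))  ' DateTime
            xw.WriteEndElement()

            xw.WriteEndElement()                      ' "values"
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildTypedXml())
    End Sub

End Module
```

The Boolean value is written in the lowercase form that XML Schema expects, and the DateTime value in the same ISO 8601 form as the reviewed elements shown later in this article.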
To show how the example processes the XML document as it streams it to the selected output object, here is the original (input) XML document, named slides-to-stream.xml:
<?xml version="1.0" encoding="us-ascii" standalone="yes"?>
<root>
<session name="All about XML">
<slides>
<slide>
<title>Agenda</title>
<review-year>2004</review-year>
<review-month>05</review-month>
<review-day>10</review-day>
<usage-cost>$4.90</usage-cost>
</slide>
<slide>
<title>Introduction</title>
<review-year>2003</review-year>
<review-month>02</review-month>
<review-day>17</review-day>
<usage-cost>free</usage-cost>
</slide>
<slide>
<title>Code Examples</title>
<review-year>2004</review-year>
<review-month>12</review-month>
<review-day>19</review-day>
<usage-cost>$7.55</usage-cost>
</slide>
<slide>
<title>Summary</title>
<review-year>2005</review-year>
<review-month>01</review-month>
<review-day>22</review-day>
<usage-cost>unknown</usage-cost>
</slide>
</slides>
</session>
</root>
Notice that it contains no position attributes in the individual slide elements, and that the review date is held as three separate elements for the year, month and day. There is also a usage-cost element for each slide, whose values are not strictly numeric. Finally, the encoding is "us-ascii", and there is no namespace declaration.
The code in the example page processes this to add the namespace, change the encoding to UTF-8, and specify the review date as a single element (as used in the first example in this article). It also adds the position attribute to each slide element, and omits the enclosing slides element to simplify the structure. It also resolves the problem with the usage-cost elements in the original XML document by converting them to numeric values, assuming a default value of $7.50 if no value is available. This is the output document that is created (though you only see this encoding if you select the MemoryStream option, for the reasons discussed earlier):
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<root xmlns="http://myns/slidesdemo/streamed">
<session name="All about XML">
<slide position="1">
<title>Agenda</title>
<reviewed>2004-05-10T00:00:00</reviewed>
<usage-cost>4.9</usage-cost>
</slide>
<slide position="2">
<title>Introduction</title>
<reviewed>2003-02-17T00:00:00</reviewed>
<usage-cost>0</usage-cost>
</slide>
<slide position="3">
<title>Code Examples</title>
<reviewed>2004-12-19T00:00:00</reviewed>
<usage-cost>7.55</usage-cost>
</slide>
<slide position="4">
<title>Summary</title>
<reviewed>2005-01-22T00:00:00</reviewed>
<usage-cost>7.5</usage-cost>
</slide>
</session>
</root>
The following sections describe each step of the process, showing how we read values from the incoming XML and generate the appropriate values for the output.
The first step is to declare an integer variable to hold the value for each position attribute, and then write the opening XML declaration. The False argument passed to the WriteStartDocument method indicates that this declaration should include the standalone="no" attribute. Then we create the root element, placing it in the namespace we require for the new document and adding this as the xmlns attribute.
Then we read up to the descendant node named session in the input document using the ReadToDescendant method, and start a session element in the output with the same name. To get all the attributes from the input session element onto the output element (in this case there is just one), we use the WriteAttributes method of the XmlWriter and pass to it the XmlReader that holds the input document. After this, we can start to loop through the remaining nodes in the input document, checking for any that are start elements using the IsStartElement method:
' declare variable to hold slide number
Dim iPosition As Int16 = 0
Try
' write opening <?xml...?> declaration to output
' note: changes the encoding to UTF-8 and sets the
' "standalone" attribute to "no" because we will generate a schema
xw.WriteStartDocument(False)
' create "root" element and add namespace declaration
Dim sNamespace As String = "http://myns/slidesdemo/streamed"
xw.WriteStartElement("root", sNamespace)
xw.WriteAttributeString("xmlns", sNamespace)
xr.ReadToDescendant("session")
' write the enclosing "session" element
xw.WriteStartElement("session")
xw.WriteAttributes(xr, True)
' read each node in the incoming XML
While xr.Read()
If xr.IsStartElement() Then
...
As each start element is found, a Select Case construct is used to decide how to process it. For a slide element in the input stream we write the opening tag for a new slide element to the output stream, then increment the value of the iPosition variable (which starts at zero) and use it to add a position attribute. We could have used the WriteAttributeString method here rather than WriteValue, which would have required less code and produced the same result. However, this is only because, for a valid integer value, the ToString method produces the same string as the WriteValue method:
...
Select Case xr.Name
Case "slide"
' write new "slide" element and add a "position" attribute
' of type integer containing the current slide number
xw.WriteStartElement("slide")
xw.WriteStartAttribute("position")
iPosition += 1
xw.WriteValue(iPosition)
xw.WriteEndAttribute()
...
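The equivalence between ToString and WriteValue does not necessarily hold for other types. The sketch below, which is not part of the example page, writes a Double as an attribute in both ways while the thread uses a culture whose decimal separator is a comma. The module name, the WriteCost function, the cost attribute and the choice of the de-DE culture are all assumptions made for illustration. WriteValue should produce the culture-independent XML form, whereas the parameterless ToString follows the current culture.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text
Imports System.Threading
Imports System.Xml

Module WriteValueVersusToString

    ' Hypothetical helper: writes <slide cost="..." /> using either
    ' WriteValue or WriteAttributeString with ToString
    Function WriteCost(ByVal useWriteValue As Boolean) As String
        Dim cost As Double = 7.5
        Dim sb As New StringBuilder()
        Dim ws As New XmlWriterSettings()
        ws.OmitXmlDeclaration = True
        Using xw As XmlWriter = XmlWriter.Create(sb, ws)
            xw.WriteStartElement("slide")
            If useWriteValue Then
                xw.WriteStartAttribute("cost")
                xw.WriteValue(cost)
                xw.WriteEndAttribute()
            Else
                ' ToString with no arguments uses the current culture
                xw.WriteAttributeString("cost", cost.ToString())
            End If
            xw.WriteEndElement()
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        ' culture chosen only because it uses a comma as decimal separator
        Thread.CurrentThread.CurrentCulture = New CultureInfo("de-DE")
        Console.WriteLine(WriteCost(True))
        Console.WriteLine(WriteCost(False))
    End Sub

End Module
```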
If the current element in the input is a title element, the code just reads the value of the existing title element (from the child text node of the element) and creates a title element with this value in the output. The XmlWriter has a WriteNode method that can be used to copy an entire element, including all its child elements and other content, to the output. However, the source document is in a different namespace from the output document (in fact it has no namespace declaration), and so this would automatically add an xmlns attribute to change the namespace, which is something we don't want to happen:
...
Case "title"
' cannot just copy node into new document because the
' namespaces are different, so read value and create
' the correct element with no namespace attribute
xr.Read()
xw.WriteElementString("title", xr.Value)
...
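To see the difference between the two approaches in isolation, the following sketch copies a title element that has no namespace into an output element that declares a default namespace. The module name, the CopyTitle function and the single-element input are assumptions made for illustration. When WriteNode is used, the copied element should carry an empty xmlns attribute that takes it back out of the default namespace; when the element is recreated with WriteElementString, it simply inherits the namespace of its parent.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module WriteNodeNamespaceSketch

    ' Hypothetical helper: copies <title> into a namespaced <slide>
    ' either with WriteNode or by recreating the element
    Function CopyTitle(ByVal useWriteNode As Boolean) As String
        Dim sb As New StringBuilder()
        Dim ws As New XmlWriterSettings()
        ws.OmitXmlDeclaration = True
        Using xr As XmlReader = XmlReader.Create(New StringReader("<title>Agenda</title>"))
            Using xw As XmlWriter = XmlWriter.Create(sb, ws)
                xw.WriteStartElement("slide", "http://myns/slidesdemo/streamed")
                xr.MoveToContent()   ' position the reader on <title>
                If useWriteNode Then
                    ' copies the element together with its (empty) namespace
                    xw.WriteNode(xr, True)
                Else
                    ' move to the text node and recreate the element
                    xr.Read()
                    xw.WriteElementString("title", xr.Value)
                End If
                xw.WriteEndElement()
            End Using
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(CopyTitle(True))
        Console.WriteLine(CopyTitle(False))
    End Sub

End Module
```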
If the current input element is a review-year element, we have a bit more work to do. We have to extract the three values from this element and its two following sibling elements named review-month and review-day, and use these values to generate a single reviewed element in the output. The following code shows how we read and store the value of the current element (review-year), then use the ReadToFollowing method of the XmlReader to move to the review-month and review-day elements and read and store their values. We can't use the ReadToNextSibling method here, because after reading the value of each element the reader is positioned on the text child node of that element. Once the code has collected the year, month and day values it attempts to create a new DateTime instance from them and generate the appropriate element in the output. If the values cannot be converted, an empty element is written to the output instead:
...
Case "review-year"
' collect values for year, month and day and
' create new "reviewed" element as a datetime type
Dim dReviewDate As DateTime
xr.Read()
Dim sYear As String = xr.Value
xr.ReadToFollowing("review-month")
xr.Read()
Dim sMonth As String = xr.Value
xr.ReadToFollowing("review-day")
xr.Read()
Dim sDay As String = xr.Value
Try
dReviewDate = New DateTime(Int32.Parse(sYear), Int32.Parse(sMonth), _
Int32.Parse(sDay))
xw.WriteStartElement("reviewed")
xw.WriteValue(dReviewDate)
xw.WriteEndElement()
Catch
' error converting values so write empty element
xw.WriteElementString("reviewed", "")
End Try
...
This section of code demonstrates the usefulness of the WriteValue method in that it automatically converts the value into the correct string representation within the XML output - there is no need to format the DateTime instance first. For example, the output generated for the first reviewed element is:
<reviewed>2004-05-10T00:00:00</reviewed>
Another example of the usefulness of the WriteValue method comes where the input element is a usage-cost element. As you saw in the listing of the original XML document, this can be a currency value (which may include the $ currency character) or some other text string such as "free" or "unknown". To process this, we first declare a Double with the value zero to hold the output value, and then check for the string "free" (in any letter case). If this is not found, we remove any leading "$" character and then attempt to parse the remaining string into a Double. We assume the default value of 7.5 for the output if the conversion fails. Then it's simply a matter of creating the appropriate element in the output, again using the WriteValue method. And, as this is the last child element within each slide element, we can close the current slide element by calling the WriteEndElement method of the XmlWriter:
...
Case "usage-cost"
' collect value for cost, see if "free", and
' create new "usage-cost" element as a double type
xr.Read()
Dim sCost As String = xr.Value
Dim iCost As Double = 0
If String.Compare(sCost, "FREE", True) <> 0 Then
If sCost.StartsWith("$") Then
sCost = sCost.Substring(1)
End If
Try
iCost = Double.Parse(sCost)
Catch
' error converting value so assume default cost
iCost = 7.5
End Try
End If
xw.WriteStartElement("usage-cost")
xw.WriteValue(iCost)
xw.WriteEndElement()
' and close the current "slide" element
xw.WriteEndElement()
End Select
...
If you look back at the input and output documents, you'll see the effects of this "business rule" - as summarized in Table 1 below.
| Input element | Output element |
| --- | --- |
| <usage-cost>$4.90</usage-cost> | <usage-cost>4.9</usage-cost> |
| <usage-cost>free</usage-cost> | <usage-cost>0</usage-cost> |
| <usage-cost>$7.55</usage-cost> | <usage-cost>7.55</usage-cost> |
| <usage-cost>unknown</usage-cost> | <usage-cost>7.5</usage-cost> |
Table 1 - The results of processing the usage-cost elements in the input document
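One way to make a rule like this easier to test is to isolate it in a function that is independent of the reader and writer. The sketch below is an assumption-laden variation rather than the code from the example page: the module name and the ParseUsageCost function are hypothetical, it uses Double.TryParse with the invariant culture instead of a Try...Catch block, and it treats a missing or empty value as "no value available".

```vbnet
Imports System
Imports System.Globalization

Module UsageCostRule

    ' Hypothetical helper implementing the usage-cost rule from Table 1
    Function ParseUsageCost(ByVal sCost As String) As Double
        Const DefaultCost As Double = 7.5
        If sCost Is Nothing Then Return DefaultCost
        sCost = sCost.Trim()

        ' "free" in any letter case means a cost of zero
        If String.Compare(sCost, "free", True) = 0 Then Return 0

        ' remove a leading currency character if there is one
        If sCost.StartsWith("$") Then sCost = sCost.Substring(1)

        ' parse with the invariant culture so "." is the decimal separator
        Dim result As Double
        If Double.TryParse(sCost, NumberStyles.Float, _
                           CultureInfo.InvariantCulture, result) Then
            Return result
        End If

        ' value could not be converted so assume the default cost
        Return DefaultCost
    End Function

    Sub Main()
        For Each s As String In New String() {"$4.90", "free", "$7.55", "unknown"}
            Console.WriteLine(ParseUsageCost(s))
        Next
    End Sub

End Module
```

The value returned by such a function could then be passed directly to the WriteValue method, as in the listing above.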
After all the slide elements have been processed, the code completes the document by closing the "open" elements. In fact, simply calling WriteEndDocument will automatically write the appropriate end tags, but doing it yourself improves readability and helps to trap errors. If you expect an element to still be open, and hence call WriteEndElement too many times, the error will help to indicate a problem in your code:
...
End If
End While
xw.WriteEndElement() ' "session"
xw.WriteEndElement() ' "root"
xw.WriteEndDocument()
Catch ex As Exception
' error reading document so display details
messages.Append("<p><b>ERROR reading XML document:</b><br />")
messages.Append("Message = " & ex.Message & "</p>")
Finally
Try
xr.Close()
xw.Close()
Catch
End Try
End Try
The final section of code shown here displays any errors when processing the document and generating the output, closes the XmlReader and XmlWriter, and we're done!
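As an alternative way to guarantee that a writer is closed, Visual Basic 2005 provides the Using statement, which disposes of the object when the block ends even if an exception occurs. The following sketch is not taken from the example page; the module name and the WriteUnclosed function are hypothetical. It also shows the behavior described earlier, where WriteEndDocument writes the end tags for any elements that are still open.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module UsingSketch

    ' Hypothetical helper: leaves two elements open and relies on
    ' WriteEndDocument to close them
    Function WriteUnclosed() As String
        Dim sb As New StringBuilder()
        Using xw As XmlWriter = XmlWriter.Create(sb)
            xw.WriteStartDocument()
            xw.WriteStartElement("root")
            xw.WriteStartElement("session")
            xw.WriteElementString("title", "Agenda")
            ' "session" and "root" are still open at this point
            xw.WriteEndDocument()
        End Using   ' the writer is closed and flushed here
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(WriteUnclosed())
    End Sub

End Module
```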
In this, the second of a series of three articles that discuss techniques for reading and writing XML in version 2.0 of the .NET Framework, we've looked at how the new XmlWriterSettings class can be used to generate instances of the XmlWriter class with specific behavior and settings. We saw how you can control the format of the output by indenting it, omitting the XML declaration, and changing the default encoding. We also demonstrated how you can use the XmlWriter to generate fragments of XML that are not themselves well-formed XML documents.
We also reviewed the techniques for using the various "write" methods of the XmlWriter class to generate XML documents and fragments, and concentrated particularly on how you generate the appropriate namespace URI and prefix declarations that are required in most of the XML documents you create. While the techniques are much the same as in version 1.x, there is one new overload of the WriteElementString method that makes it easier to generate elements that contain both a namespace prefix and the declaration of that namespace.
Then we examined how you can stream XML by combining an XmlReader and an XmlWriter, and (if required) process the XML as it is read and written. The many useful features of the XmlReader class that you saw in the previous article, combined with the techniques available in the XmlWriter class for controlling the output, mean that this is a useful way to perform processing of large documents, change the encoding of a document, or simply stream it from one input type such as a disk file to another such as a StringBuilder or MemoryStream.
Then in the latter sections of the article, we looked at how the XmlWriter class in version 2.0 makes it easier to create XML documents that contain the correct formatting for CLR typed values, by using the new WriteValue method. This accepts any of the simple CLR data types as input, and converts them to the appropriate XML-compliant text string that represents the value. We even sneaked in some bonus content by showing how you can infer a schema from an XML document using the new XmlSchemaInference class!
In summary, the topics we covered were:
In the next and final article in this series, we look at how the new XmlReader and XmlWriter classes, together with their "settings