©2005 Alex Homer, Stonebroom Limited – alex@stonebroom.com
This is the second in a series of three articles that look at how the latest version of Microsoft’s enterprise-level database, SQL Server 2005, now offers great support and close integration with XML as a data persistence format. This includes new ways to validate, store and query XML documents that are stored within the database. SQL Server 2005 provides native support for XML that can vastly improve application performance, while supporting robust and safe multi-user access to the data contained within the XML documents.
The topics we covered in the previous article were:
In this article, we'll look at:
Then, in the final article of the series, we'll be exploring:
Just because XML was originally conceived as a stream-based or a disk-based data persistence format doesn't mean that developers ignored their usual data stores. After all, an XML document is basically just a string of text characters, and so can be stored in a database using a char, varchar, or text type column.
Of course, this means that, when you want to use the XML, you have to read the complete document from the column. And if you want to edit and update it, you have to write the complete document back into the table again. If it's a large document, this can soak up network bandwidth, processing and server resources, and reduce application performance.
Another approach is to "shred" the document into its component parts and store each value from the document in a separate column. This has several advantages. One is that you can store each value in a column of the appropriate data type (int, varchar, datetime, etc.). This allows simple indexing of the document (something you can't easily do if you store the whole thing in a single column), and means that comparisons can be made based on the correct data type.
In other words, you can search for things like product IDs or names without having to extract the whole document and parse it each time. And you can select rows that meet specific conditions, such as dates in a particular range, by simply using a SQL WHERE clause on the appropriate column.
The problem with the "shredding" approach is the amount of developer effort required to build the data access code that performs all these tasks, including validating the document, extracting each value, and interfacing with the database to persist and extract values and complete documents as required. As they say in TV commercials, "there has to be a better way!" The better way is, of course, to use a database such as SQL Server 2005 that provides all these features built into the core database engine.
In the previous article, we saw how SQL Server 2005 provides several features that make persisting and managing XML document much easier and more efficient than the traditional approaches we've just been discussing. The core features are:
We described these features in the previous article, and provided a simple application that you can use to experiment with XQuery in SQL Server 2005. In this and the subsequent article, we'll use the techniques outlined above to demonstrate how you can save time and effort, and get better application performance, when using XML in SQL Server 2005.
As well as the XQuery Tester application described in the previous article, you can also use several other pages provided in the downloadable samples to see XQuery in action. The full list of these examples is:
We won't be listing all of the code in this article, because much of it uses standard ASP.NET and ADO.NET techniques to create the pages and connect to the database. What we will concentrate on is how the examples exploit the new XML data management features in SQL Server 2005. Many of the examples use an un-typed xml column so that the queries are easier to assimilate (no namespace declaration or prefixes are required). However, some processes (such as the modify method when replacing values in the XML) require a typed column, and so the script we provide creates two tables - one named Store with an un-typed xml column and one named StoreTyped with a typed xml column. You'll see both used in the examples.
Traditionally, developers who need to access values in an XML document stored in a database would read the complete document into an XML parser, and then navigate through the document to extract the values they need. The first example, "Reading XML Values the Traditional Way" demonstrates this approach. It uses a SQL statement that extracts the contents of the xml column named Demographics, for a specified CustomerID value:
Dim sql_getbank As String = "SELECT Demographics FROM Store " _
& "WHERE CustomerID = @CustomerID"
The page also uses an ASP.NET GridView control and a SqlDataSource control to display a list of customers, and selecting one executes a procedure named ShowBankName. This procedure connects to the database, creates a command and adds a parameter for the CustomerID to it, then calls the ExecuteScaler method of the command to get the XML from the matching row in the database:
Sub ShowBankName(ByVal sender As Object, ByVal e As EventArgs)
Dim sXML As String = String.Empty
Using con As New SqlConnection(ConfigurationManager.ConnectionStrings _
("XMLTestConnectionString").ConnectionString)
Dim cmd As New SqlCommand(sql_getbank, con)
cmd.Parameters.Add(New SqlParameter("@CustomerID", _
Int32.Parse(grid1.SelectedDataKey.Value)))
Try
con.Open()
sXML = cmd.ExecuteScalar()
con.Close()
Catch ex As Exception
lblMessage.Text = ex.Message
End Try
End Using
...
Now the code can check if an XML document is available, and if so load it into an XmlDocument instance and use the SelectSingleNode method to get a reference to the BankName element. The value of this element is in the first child node (a text node), so it can be extracted from the Value property of ChildNodes(0) and displayed in a Label control on the page:
...
If sXML <> String.Empty Then
Dim xd As New XmlDocument()
xd.LoadXml(sXML)
Dim xn As XmlNode = xd.SelectSingleNode("//BankName")
lblMessage.Text = "Bank name is: <b>" & xn.ChildNodes(0).Value & "</b>"
Else
lblMessage.Text = "Cannot find bank name"
End If
End Sub
The second example, "Using a Simple XQuery with an 'xml' Column", achieves the same result as the preceding example, but by using an XQuery instead. The SQL query used this time calls the query method of the Demographics column to access the BankName node, and uses the XPath data function to extract just the value (the content of the BankName element):
Dim sql_getbank As String = "SELECT Demographics.query('data(//BankName)') " _
& "FROM Store WHERE CustomerID = @CustomerID"
Executing this query simply returns the bank name as a String, and so the ShowBankName routine in this example - after opening the connection to the database and executing the query - simply displays the value that is returned:
lblMessage.Text = "Bank name is: <b>" & cmd.ExecuteScalar() & "</b>"
As you can see, this makes the code much simpler and more compact, and has the added benefit of reducing network traffic between the database and the data access code. Figure 1 shows the two example pages we've just described.

Figure 1 - Reading the bank name using both traditional techniques and XQuery
The third example we provide, "Comparing the Traditional Way and XQuery", does much the same as the two examples you've just seen, but this time extracts and displays the customer name and bank name for all 700+ rows in the Store and StoreTyped tables. It allows you to choose one of three ways to accomplish this - using the traditional approach, using XQuery against an un-typed xml column, and using XQuery against a typed xml column. The three queries it uses are:
Dim sql_trad_getbank As String = "SELECT CustomerName, Demographics FROM Store"
Dim sql_xquery_getbank As String = "SELECT CustomerName, " _
& "BankName = Demographics.query('data(//BankName)') FROM Store"
Dim sql_xquery_getbank_typed As String = "SELECT CustomerName, " _
& "BankName = Demographics.query('declare namespace x=""http://testschemas/StoreSurvey""; " _
& "data(//x:BankName)') FROM StoreTyped"
The code to extract the rows records the current time, executes a command using the appropriate query, and then iterates through the rows adding the customer name and bank name to a StringBuilder. Once all the rows have been processed, the number of elapsed milliseconds is calculated, and this - together with the list of customer and bank names - is displayed in the page. This is the relevant section of code for the traditional (non-XQuery) approach:
Dim start As DateTime = DateTime.Now
con.Open()
reader = cmd.ExecuteReader(CommandBehavior.CloseConnection)
While reader.Read()
Dim xd As New XmlDocument()
xd.LoadXml(reader("Demographics"))
Dim xn As XmlNode = xd.SelectSingleNode("//BankName")
If Not xn Is Nothing Then
builder.Append("Customer '" & reader("CustomerName") _
& "' banks at " & xn.ChildNodes(0).Value & "<br />")
End If
End While
Dim span As TimeSpan = DateTime.Now.Subtract(start)
builder.Insert(0, String.Format("<p>Data access took {0} milliseconds</p>", _
span.Milliseconds.ToString()))
lblMessage.Text = builder.ToString()
The second and third options, executing an XQuery, use similar but much more compact code:
Dim start As DateTime = DateTime.Now
con.Open()
reader = cmd.ExecuteReader(CommandBehavior.CloseConnection)
While reader.Read()
builder.Append("Customer '" & reader("CustomerName") _
& "' banks at " & reader("BankName") & "<br />")
End While
Dim span As TimeSpan = DateTime.Now.Subtract(start)
builder.Insert(0, String.Format("<p>Data access took {0} milliseconds</p>", _
span.Milliseconds.ToString()))
lblMessage.Text = builder.ToString()
Figure 2 shows the results of selecting the traditional approach, and you can see that - with the hardware we tested against - this took some 400 milliseconds to extract the data and process all the rows. This time includes that taken to manipulate the StringBuilder:

Figure 2 - Reading multiple rows using traditional XML data access techniques
However, as you can see in Figure 3, data access times are considerably improved when using XQuery. For an un-typed xml column, the time taken to extract and process the same number of rows is some 25% less - the average on our system was around 300 milliseconds. And even more striking is the performance when accessing a typed xml column. In this case the average access and processing time was around 220 milliseconds - a reduction against the traditional approach of some 45%. This is because SQL Server 2005 can shred the XML and store it in a more efficient manner, providing faster access when querying and when updating the content of XML documents.
And remember that this is with the database and Web server on the same machine, and includes the time for iterating the rows and building the display string in the StringBuilder. The reduced network bandwidth usage when accessing a remote SQL Server will enhance the differences even more, but the percentage reduction in the actual data access times is impressive even when these factors aren't taken into account.

Figure 3 - Reading the bank name using both traditional techniques and XQuery
The next example, "Using Parameters with an XQuery", demonstrates some of the ways that you can pass parameter values to an XQuery, depending on how you want to specify the results that are included in the rowset returned by the query. As we discussed in the previous article, you can access the value of a T-SQL variable from within an XQuery or XPath statement using the sql:variable function. The example provides four different queries that you can execute, including a stored procedure that executes an XQuery, and displays the results in an ASP.NET GridView control. Figure 4 shows the example page in action with the results of executing the stored procedure visible.

Figure 4 - Different ways of using parameters with an XQuery statement
The first option selects rows using traditional T-SQL techniques, where the parameter is used in the WHERE clause. The value method of the Demographics column is used to extract the value of the NumberEmployees element, and cast it to an int data type so it can be compared directly to the value of the @emp parameter (of type SqlDbType.Int) that is passed to the query by a SqlParameter added to the command. The SELECT clause contains an XQuery that extracts just the value of the BankName element from the rows that match the expression in the WHERE clause. This produces the same result as you see in Figure 4:
SELECT CustomerName,
BankName = Demographics.query('data(//BankName)')
FROM Store
WHERE Demographics.value('(//NumberEmployees)[1]', 'int') > @emp
The second statement attempts to replace the WHERE clause with a condition within the XQuery itself. The call to the query method of the Demographics column includes an XPath that selects the value of the BankName element, but only where the StoreSurvey root element has a NumberEmployees element containing a value that matches the parameter passed to the statement. Notice how the XPath number function is used to convert the data type of the NumberEmployees element with the data type of the parameter:
SELECT CustomerName,
BankName = Demographics.query('
data(/StoreSurvey[number(NumberEmployees[1]) > sql:variable("@emp")]/BankName)
')
FROM Store
However, this doesn't actually produce the result we want. If you run the example and select this option, you'll see that the rowset returned by the query contains all the rows from the table. The bank name is NULL in rows where the number of employees is less than the value in the parameter passed to the query. This is a common mistake, and demonstrates how you have to be careful when using XQuery to select rows.
The third statement solves this problem by combining the XQuery selection technique with the use of a WHERE clause. In this case, the exist method of the Demographics column is used to return either True or False, depending on whether there is a node in the XML document for this row that contains a NumberEmployees element with a value greater than the value of the @emp parameter (the syntax .[1] means "the first node of the current nodeset" - remember that the number function requires a single node, whereas the //NumberEmployees XPath statement returns a node collection even if there is only one node that matches). And, again, the number function is used to ensure that the data types of the element value and the parameter value match:
SELECT CustomerName,
BankName = Demographics.query('data(//BankName)')
FROM Store
WHERE Demographics.exist('//NumberEmployees[number(.[1]) > sql:variable("@emp")]') = 1
The final option is a statement that just executes a stored procedure named GetBankNamesByEmployee within the database. The TSQL used in this stored procedure is the same as that described above (for the third option). However, it does demonstrate that you can use XQuery in just the same way in a stored procedure as you do in declarative and parameterized SQL statements:
CREATE PROCEDURE dbo.GetBankNamesByEmployee
@emp int AS
SELECT CustomerName, BankName = Demographics.query('data(//BankName)')
FROM Store WHERE Demographics.exist('//NumberEmployees[. > sql:variable("@emp")]') = 1
Having seen how you can extract values from an XML document stored in SQL Server 2005, this next example changes direction somewhat by extracting an XML document of a specific format from the contents of an xml column, and then uses this XML to generate an HTML page by applying an XSL-T style sheet to the newly-generated XML. The SQL statement with its embedded XQuery looks like this:
SELECT Demographics.query('
<sales-summary>
<customer-name>{ sql:column("CustomerName") }</customer-name>
<sales-data>
<total-sales>{ data(//AnnualSales) }</total-sales>
<employees>{ data(//NumberEmployees) }</employees>
<sales-per-employee>
{ round(number((//AnnualSales)[1]) div number((//NumberEmployees)[1])) }
</sales-per-employee>
<store-size>{ data(//SquareFeet) }</store-size>
<sales-per-sqfoot>
{ round(number((//AnnualSales)[1]) div number((//SquareFeet)[1])) }
</sales-per-sqfoot>
</sales-data>
</sales-summary>
')
FROM Store WHERE CustomerID = @custid"
You can see that it creates an XML document with the root element sales-summary, and with two child elements. The first is a customer-name element, and this is followed by a sales-data element that itself contains five other items of sales-related information. Notice how the sales-per-employee and sales-per-sqfoot values can be calculated within the XQuery using the XPath number function and div operator. The result is then converted to a non-fractional number using the round function. The resulting XML document will look something like this:
<sales-summary>
<customer-name>Acceptable Sales & Service</customer-name>
<sales-data>
<total-sales>800000</total-sales>
<employees>12</employees>
<sales-per-employee>66667</sales-per-employee>
<store-size>7000</store-size>
<sales-per-sqfoot>114</sales-per-sqfoot>
</sales-data>
</sales-summary>
We then use a simple XSL-T style sheet to transform this XML into HTML. This is the style sheet:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<b>
Annual sales figures for customer
'<xsl:apply-templates select="./sales-summary/customer-name" />'</b>
<p />
<xsl:apply-templates select="./sales-summary/sales-data" />
</body>
</html>
</xsl:template>
<xsl:template match="customer-name">
<xsl:value-of select="."/>
</xsl:template>
<xsl:template match="total-sales">
Your total annual sales are $<xsl:value-of select="."/><br />
</xsl:template>
<xsl:template match="employees">
The number of people you employee is <xsl:value-of select="."/><br />
</xsl:template>
<xsl:template match="sales-per-employee">
Your annual sales per employee are therefore $<xsl:value-of select="."/><br />
</xsl:template>
<xsl:template match="store-size">
Your store has a total square footage of <xsl:value-of select="."/><br />
</xsl:template>
<xsl:template match="sales-per-sqfoot">
This means that the annual sales per square foot are $<xsl:value-of select="."/><br />
</xsl:template>
</xsl:stylesheet>
You can see that it creates the html and body elements, and - within the body element - applies a template that inserts the customer name into the output. Then it applies templates that match the five other elements within the nested sales-data element to insert these values and the appropriate accompanying text into the output.
The relevant section of code in the ASP.NET page "Combining XQuery and XSL-T" that uses this XQuery and style sheet are shown next. The page uses an ASP.NET GridView control and a SqlDataSource control to display a list of customer names, as in the first two examples in this article. Selecting a row in the GridView executes a routine named ShowSalesReport. This routine opens a connection to the database, creates a command, adds a parameter to specify the selected customer, and executes the SQL statement you saw earlier.
The XML document returned from the query is accessed by calling the GetSqlXml method of the SqlDataReader to get a SqlXml instance, which exposes the CreateReader method that we can use to get an XmlReader over the XML document itself. We do this, rather than extracting the value as a String, so that we can pass the reader to the Transform method of the XslCompiledTransform class later in our code:
Sub ShowSalesReport(ByVal sender As Object, ByVal e As EventArgs)
Dim builder As New StringBuilder()
Dim xmlr As XmlReader
Using con As New SqlConnection(ConfigurationManager.ConnectionStrings _
("XMLTestConnectionString").ConnectionString)
Dim cmd As New SqlCommand(sql_select, con)
cmd.Parameters.Add(New SqlParameter("@custid", SqlDbType.Int))
cmd.Parameters("@custid").Value = Int32.Parse(grid1.SelectedDataKey.Value)
Try
con.Open()
Dim dr As SqlDataReader = cmd.ExecuteReader(CommandBehavior.CloseConnection)
dr.Read()
Dim oXML As SqlXml = dr.GetSqlXml(0)
xmlr = CType(oXML.CreateReader(), XmlReader)
Catch ex As Exception
lblMessage.Text = ex.Message
Return
End Try
...
The next step is to create an instance of the XslCompiledTransform class, and load the XSL-T style sheet - this is stored in the data subfolder of the samples. Then we create a StreamWriter that will create the output file, and call the Transform method to generate the new HTML document. The second parameter to the Transform method is a reference to an XsltArgumentList instance, but as we have to arguments to pass to the style sheet we use Nothing for this parameter.
...
Dim sPath As String = Path.Combine(Request.PhysicalApplicationPath, "data")
Dim transform As New XslCompiledTransform()
Try
transform.Load(Path.Combine(sPath, "sales-transform.xslt"))
Using writer As New StreamWriter(Path.Combine(sPath, "sales-report.htm"))
transform.Transform(xmlr, Nothing, writer)
End Using
lblMessage.Text = "Created sales report <a href='data/sales-report.htm' target='_blank'>" _
& "sales-report.htm</a>"
Catch ex As XsltException
lblMessage.Text = ex.Message
End Try
End Using
End Sub
Figure 5 shows this example page in action. You can see the hyperlink that is created after the new HTML file has been generated, and clicking on this hyperlink displays the HTML page in a new browser window.

Figure 5 - Creating an HTML document from an XML document generated through an XQuery
In the previous article, we discussed the great new features in SQL Server 2005 that make it easier to work with XML documents, and also allows you to write more efficient data access code when you need to persist, access and update XML. SQL Server 2005 is not the first database to provide these features, but the combination of these new features and integration with ADO.NET make it easy to take advantage of them, while vastly reducing the amount of code you have to write.
In this article, you saw this demonstrated in several examples. We looked at some simple techniques for extracting data from an xml column using XQuery, and then compared the performance of XQuery with the traditional approach of loading the entire document into an XML parser and extracting the required values. As you saw, XQuery reduces the bandwidth requirements by limiting the data returned over the network, removes the need to use an XML parser client-side, and dramatically reduces processing overhead and response times. And, when the XML document is stored in a typed xml column, the performance improvements are even more noticeable.
Then we looked at some different ways you can pass values to a query that contains an XQuery statement - pointing out one of the common pitfalls when selecting rows to be returned. Finally, we saw how you can use XQuery to create custom XML documents from the data stored in an xml column, and then apply a style sheet to the result to transform it into any other format you might require. Of course, XQuery isn't limited to generating simple scalar values and XML documents or fragments. You can use XQuery to generate any output you want, for example you could generate comma-separated or tab-separated data directly within your XQuery if this is the format that your applications require.
In the final article in this series, we'll delve deeper into XQuery and SQL Server 2005 XML features to see how we can improve performance when updating XML documents stored in the database, and how we can use XQuery within managed code stored procedures.
©2005 Alex Homer, Stonebroom Limited – alex@stonebroom.com