Alex Homer
(alex@stonebroom.com) ©2005
Stonebroom Limited,
Note: the code described in this article is
based on the Beta 1 release of the .NET Framework v 2.0. All the examples can
be downloaded from http://www.daveandal.net/articles/schema-api/, and you can also run many of them online from the same location.
Most data
access technologies provide a way for you to query the schema of a database,
and obtain information about the tables, stored procedures, data types, users
and other content of a database. However, the technique varies depending on the
data provider or database driver you are using, and is not always straight-forward
to implement. In ADO.NET 2.0, Microsoft has added a set of features to the
standard managed code classes that you can use as an API to query database
schemas. This article looks at how you can use these features in your
applications. It covers:
Querying
the schema of a database involves use of a single method that is added to the Connection class in ADO.NET 2.0. This method,
named GetSchema, uses the current connection to
retrieve information about the database to which that connection points. To
query across databases, you have to create a separate connection to each one,
and then query them individually using each connection.
The GetSchema method is implemented within the abstract base class DbConnection (in the System.Data.Common namespace), and is available in all
the provider-specific connection classes such as SqlConnection, OleDbConnection, OracleConnection and OdbcConnection. There are three overloads of the GetSchema method, the syntax of each one being shown in Table 1.
Table 1 - The
Overloads of the DbConnection.GetSchema Method
|
Method Overload |
Description |
|
GetSchema() |
Returns a DataTable containing a row for each of the
metadata collections that are available from the database. Equivalent to
specifying the String value "MetaDataCollections" when using the GetSchema(String)
overload. |
|
GetSchema(String) |
Takes the name of a metadata collection and
returns a DataTable containing a row for each item found in that
metadata collection in the database. |
|
GetSchema(String, String-Array) |
Takes the name of a metadata collection and
an array of String values that specify the restrictions on
which the rows in the returned DataTable will be filtered. Used to select
and return a DataTable containing only the rows in the metadata
collection specified in the first parameter that match the restrictions
specified in the second parameter. |
Obtaining a
list of the metadata collections that are available is easy, and gives you a
good feel for the types of information that you can obtain from your database.
The following code, in the Page_Load event handler of the ASP.NET
example page named get-collections-sql.aspx, first collects the connection
string from web.config and then creates a connection to
the database. Then it calls the GetSchema method to obtain a standard ADO.NET
DataTable
containing the list of metadata collections that are available. Notice that you
can use the GetSchema method with the String parameter value "MetaDataCollections", or with no parameter, to get the metadata
collection list:
Sub Page_Load()
' read connection string from web.config
Dim connect As String = ConfigurationSettings.ConnectionStrings( _
"nwind-sql").ConnectionString
' create a connection to the database
Using conn As New SqlConnection(connect)
Try
conn.Open()
' get DataTable with all available metadata collections
Dim dt As DataTable = conn.GetSchema("MetaDataCollections")
' or could just use: Dim dt As DataTable = conn.GetSchema()
' bind DataTable to GridView control to display rows
grid1.DataSource = dt
grid1.DataBind()
Catch ex As Exception
output.Text = "* ERROR: " & ex.Message
Finally
conn.Close()
End Try
End Using
End Sub
To display
the contents of the DataTable, the code binds it to a GridView control located in the HTML section
of the page. The listing below shows the GridView control declaration, and Figure 1
shows the results:
<asp:GridView id="grid1" runat="server"
BorderWidth="1px" ForeColor="Black" BackColor="White"
CellPadding="4" BorderStyle="None" BorderColor="#dedfde"
GridLines="Vertical" EnableViewstate="False">
<HeaderStyle ForeColor="White" Font-Bold="True" BackColor="#6b696b" />
<RowStyle BackColor="#f7f7de" />
<AlternatingRowStyle BackColor="White" />
<SelectedRowStyle ForeColor="White" Font-Bold="True" BackColor="#CE5D5A" />
</asp:GridView>

Figure 1 - Reading a
List of the Available Metadata Collections from SQL Server
It's
important to recognize that not all providers will return the same set of metadata
collections. For example, using the System.Data.OleDb provider, and the OleDbConnection class, the GetSchema method returns a much more
restricted list of collections, as shown in Figure 2.

Figure 2 - Reading a
List of the Available Metadata Collections using the OleDb
Classes
Meanwhile,
using the System.Data.Odbc.OdbcConnection class gives a different set of
collections again. You can see the list in Figure 3. All these examples use the
same Northwind sample database, running on SQL
Server 2005 ("

Figure 3 - Reading a
List of the Available Metadata Collections using the Odbc
Classes
The set of
example pages you can download for this article also contains two that query a
Microsoft Access (JET) database for lists of metadata collections. The results
from these, as shown in Figure 4, demonstrate again how the information
available differs depending not only on the type of database, but also the
provider you use to access it.

Figure 4 -
Irrespective
of which provider you use and which type of database you query using the Schema
API, there are some common factors that you can rely on. These make it easier
to write code that will work (and adapt itself if required) to the specific database
type and provider. Every provider and database combination should expose
collections that allow code to find out about the built-in objects and the
database engine itself:
Most of the
remaining collections allow code to enumerate the common objects that make up
the contents of the database, such as tables, columns, indexes, stored
procedures, etc. Notice that some providers expose a Tables and a Views metadata collection, while others
(OLE-DB in particular) just expose a Tables collection. In general, with most
of the providers, the Tables collection includes Views that are declared
within the database, and these are identified by a value in one of the columns
in the DataTable
that is returned by the GetSchema method.
Some
providers also allow you to enumerate the databases on the server, where you
are connecting to a database server that can support more than one database
(when querying an Access database file, for example, there is obviously only
one database available). For SQL Server, this collection is named Databases when queried via the SqlClient provider, and Catalogs when queried via the OleDb provider. However, the Odbc provider for SQL Server does not
expose this collection.
Before we
go on to look in more detail at using the GetSchema method, you might like to
experiment with an ASP.NET page that we provide in the samples for this
article. Named database-explorer.aspx, it allows you to see the list of
metadata collections for SQL Server using the three standard providers, and a
JET database through two providers. The lists of collections are, in fact, the
same as you saw in the earlier screenshots. However, you can select a metadata
collection and see the entire contents in a second GridView control on the same page.
Figure 5
shows the example page in action, with the list of provider and database
combinations at the top. Below this is the name and version of the selected
database, which are extracted from the DataSourceInformation metadata collection. This is
followed by the list of metadata collections that are available for the
selected SQL Server with the SqlClient provider (not all are visible in
the screenshot), with the Tables collection selected.

Figure 5 - Selecting a
Metadata Collection in the Database Explorer Example Page
Figure 6
shows the second GridView control in this example page, populated with
the contents of the Tables metadata collection (again, only part of the
list is visible in the screenshot). However, you can see that the information
available includes the database name (the table_catalog value), the name of the schema on
which the table is based (in this case they are all owned by the dbo
schema, but this is not always the case - especially in SQL Server 2005). There is, of course, the table name, and also
the table type. Notice that, as mentioned earlier, views as well as ordinary
(base) tables are included in the collection, and are differentiated by the
value in the table_type column.

Figure 6 - Part of the
Contents of the Tables Metadata Collection for the Northwind
Database
You will certainly
find it useful to browse through the collections on your machine
(unfortunately, we are not able to publish this example on our public Web site
for obvious security reasons). You'll probably be surprised at the amount of
detail that is available. If you don't have a database server on your local
machine, you can edit the connection strings in the web.config file to query a remote database elsewhere
on your network instead.
You saw a
brief example near the start of this article of using the Schema API to get a
list of the available metadata collections. In this section, you'll see in more
detail how you can extract specific sets of data from these collections, and
use it in your applications. As a simple example to start with, you saw in the
previous section how the database explorer page we provide displays the name
and version of the database being queried. This is easy to do. Assuming you
have already created a connection to the database, you just call the GetSchema method with the parameter "DataSourceInformation"
to get a DataTable containing a single row that describes the
database engine. Then you can pull the values out of the row, format them into
the required string, and display them. The listing below shows the code
required:
' get the database/server details and display them
Dim dt As DataTable = conn.GetSchema("DataSourceInformation")
output-label.Text = String.Format("{0} (version {1})", _
dt.Rows(0)("DataSourceProductName").ToString(), _
dt.Rows(0)("DataSourceProductVersion").ToString())
Figure 7 shows the resulting output from this code. You can,
of course, use the data base explorer example page to obtain the column names
(and see what data they contain). It will help you write code that takes
advantage of the Schema API.
![]()
Figure 7 - Displaying
the Database Engine Name and Version
So far,
you've seen how you can extract the contents of a metadata collection by
specifying the collection name in a call to the GetSchema method. However, there is an
overload of the GetSchema method that takes both a collection
name and a restriction array
(as mentioned in Table 1 earlier). The restriction array is simply an array of String values that are used to filter the
rows in the collection, so that you can effectively select only the information
you want.
Each
collection has a specific number of restrictions (i.e. fields or columns that
you can filter on), and you must provide a restriction array of the correct
size for the collection you specify when you call GetSchema. The number of restrictions is
contained the MetaDataCollections collection, as you can see by
referring back to Figures 1 to 4. For example, the number of restrictions for
the Tables
and Columns
collections is 4, while the Databases and Catalogs collections (where available) only
accept one restriction. Collections that accept zero restrictions do not allow
you to filter the rows that are returned - examples being the MetaDataCollections collection itself, and the DataSourceInformation and DataTypes collections.
More
details of the restrictions for each collection are available from the Restrictions collection, which should be
available for all provider/database combinations. You can query this collection
with the GetSchema method to see them all (you can’t
apply restrictions to this collection), using:
Dim dt As DataTable = conn.GetSchema("Restrictions")
Alternatively, you can use the Database Metadata Explorer sample we provide. Figure 8 shows the results for the SqlClient provider against a SQL Server 2005 database. The first column shows the metadata collection name, the second column shows the name of the restriction as used within the database (i.e. the name of the object is filters upon), the third column shows the name that defines the restriction (and is used as the column name of the returned DataTable), and the fourth column shows the order of the restrictions for each collection.

Figure 8 - Viewing the
Restrictions metadata collection in the Database Explorer example page
One other
item of information returned by the MetaDataCollections collection is the number of
identifier parts that are required to make up a unique restriction on each
metadata collection. For example, as you can see in Figures 1 to 4, the Tables metadata collection accepts four
restrictions, but only three of these (the value in the NumberOfIdentifierParts column) are required to specify a table
uniquely. This is because a table can only be of type BASE
TABLE or of type VIEW, and yet you cannot use the same
name for a table and a view in the same database. Therefore, the combination of
the database/catalog name, the owner/schema name and
the table name are sufficient to uniquely identify a table.
On the
other hand, the Columns collection has four restrictions and four
unique identifier parts. As you can see in Figure 8, the first three restrictions
are the same as for the Tables collection (i.e. the database/catalog name, the owner/schema name and the table name).
But these alone are not sufficient to uniquely identify a column in a table,
and so the fourth restriction (the column name) is also required. Hence the
number of identifier parts for this collection is four.
You can, of
course, filter on any of the restrictions, including those that are not part of
the set of unique identifiers. For example, when using the Tables metadata collection, you can
specify a restriction for the table type - perhaps to get a list of all views
or all base tables when you don’t specify a table name restriction.
Restrictions
are useful when you only want to select a specific item, for example to find
details about one table that you know exists in the database. As you can see in
Figures 1 to 4, you need to use a restrictions array of size four, and the
filters you can apply are the database (catalog)
name, the owner (or schema name), the table name, and the table type ("BASE
TABLE" or "VIEW"). Assuming you know the four
values, you can filter on all of them like this:
Dim restrictions(3) As String
restrictions(0) = "Northwind" ' database/catalog name
restrictions(1) = "dbo" ' owner/schema name
restrictions(2) = "Orders" ' table name
restrictions(3) = "BASE TABLE" ' table type
Using conn As New SqlConnection("your-connection-string")
Try
conn.Open()
grid1.DataSource = conn.GetSchema("Tables", restrictions)
grid1.DataBind()
Catch ex As Exception
output.Text = "* ERROR: " & ex.Message
Finally
conn.Close()
End Try
End Using
You can also use just one, or some, of the restrictions. Any array items that you leave set to Nothing (null in C#) will be ignored. For example, if you want all tables named Orders, irrespective of which owner/schema they belong to, you could use:
Dim restrictions(3) As String
restrictions(2) = "Orders" ' table name
restrictions(3) = "BASE TABLE" ' table type
And if you want to find either a table or a view with a specific name, you can omit the fourth restriction as well:
Dim restrictions(3) As String
restrictions(2) = "Orders" ' table name
As an example of using restrictions, the example page named get-tables.aspx demonstrates selecting in the Tables metadata collection with four restrictions, and then with a single restriction. The code uses a generic routine that accepts a connection string, a metadata collection name, and a restrictions array; and returns a DataTable populated from the specified collection. The routine, a function named MyGetSchemaRoutine, is shown in the next listing:
Private Function MyGetSchemaRoutine(sConnString As String, _
sCollectionName
As String, aRestrictArray As String()) As DataTable
' create a
connection to the database
Using conn As New SqlConnection(sConnString)
Try
conn.Open()
' return
DataTable with all metadata about Tables that
' match the
restrictions specified in the array
Return conn.GetSchema(sCollectionName, aRestrictArray)
Catch ex As Exception
output.Text =
"* ERROR: " & ex.Message
Return Nothing
Finally
conn.Close()
End Try
End Using
End Function
The code in the Page_Load event handler can then call this routine to get a DataSet based on the values passed to it as parameters. In the next listing, you can see that - after fetching the connection string for the web.config file - the code creates a restrictions array that specifies one table. Of course, not all of these restrictions are actually required to select a single table, as the connection itself defines the database that the information will be extracted from, and (as you saw earlier) only the first three restrictions are required to uniquely identify a table. Once the restrictions array is populated, the code calls the MyGetSchemaRoutine function and then displays the results in the first GridView control declared in the HTML section of the page:
Sub Page_Load()
' read connection string from web.config
Dim connect As String = ConfigurationSettings.ConnectionStrings( _
"nwind-sql").ConnectionString
' create a restrictions array to select specific rows
' Tables collection has 4 restrictions:
' table_catalog, table_schema, table_name, table_type
' null values indicate no restriction
Dim restrictions(3) As String
restrictions(0) = "Northwind"
restrictions(1) = "dbo"
restrictions(2) = "Orders"
restrictions(3) = "BASE TABLE"
' call routine to get DataTable of matching metadata rows
Dim table As DataTable = MyGetSchemaRoutine(connect, "Tables", restrictions)
' bind DataTable to first GridView control to display rows
grid1.DataSource = table
grid1.DataBind()
...
The code continues by clearing the restrictions array (setting all the items in it to null) and specifies just the table type restriction. Then it calls the MyGetSchemaRoutine function again, and displays the results in the second GridView control declared in the HTML section of the page:
...
' clear restrictions array
Array.Clear(restrictions, 0, restrictions.Length)
' specify new restriction to select:
' table_type="VIEW" (i.e. exclude BASE TABLES)
restrictions(3) = "VIEW"
' call routine to get DataTable of matching metadata rows
table = MyGetSchemaRoutine(connect, "Tables", restrictions)
' bind DataTable to second GridView control to display rows
grid2.DataSource = table
grid2.DataBind()
End Sub
The results of running this example are shown in Figure 9. So you can see how easy it is to get lists of tables from your database using the Schema API. And if all you want to do is check if a table exists, you just specify enough restrictions to uniquely select it in the Tables metadata collection, and then query the Count property of the Rows collection of the DataTable to see if it is equal to one (or greater than zero):
Dim restrictions(3) As String
restrictions(0) = "Northwind"
restrictions(1) = "dbo"
restrictions(2) = "Orders"
Dim dt As DataTable = conn.GetSchema("Tables")
If dt.Rows.Count = 1 Then
' table exists in database
Else
' table does not exist in database
End If

Figure 9 - Using
restrictions with the Tables metadata collection
As a final
example of using the Schema API, the example page named get-column-details.aspx shows a common scenario that you may come
across when building data management tools. If you want to know all about a
specific table in a database, and in particular details of the columns in the
table, you can use the Columns metadata collection. This example queries the Columns metadata collection for
columns in the Orders table of the Northwind database, and displays the details
in an attractive and readable format.
Obtaining
the information from the Columns metadata collection is easy enough - you just populate
restriction entries that specify the table, but not the column name. In other
words, you just require:
Dim restrictions(3) As String
restrictions(0) = "Northwind"
restrictions(1) = "dbo"
restrictions(2) = "Orders"
Then you
call the GetSchema method for the Columns metadata collection:
Dim dt As DataTable = conn.GetSchema("Columns", restrictions)
The awkward
part is converting the contents of the DataTable into an attractive display. We
chose to create an HTML table dynamically from the DataTable, adding style attributes to give
the appearance shown in Figure 10 (not all columns from the original Orders table in the database are visible
in this screenshot). The code to generate the HTML table is not listed here, as
it is not really the focus of this article, but you can download the samples
from our Web site at http://www.daveandal.net/articles/schema-api/ to view all the code.

Figure 10 - Extracting
and displaying table column information using the Schema API
The main
point of this example is to give you a chance to explore the details that are
available from the Columns metadata collection. You can see that the
columns are ordered alphabetically in the DataTable
containing the metadata, but
their ordinal position is contained in the metadata. Also visible is the default
value, nullability, data type, the maximum length for
character-based columns, the precision, scale and size for numeric columns, and
the character set used - amongst other useful items of information.