What's In Your Database?

The Schema API in ADO.NET 2.0

Alex Homer (alex@stonebroom.com)      ©2005 Stonebroom Limited, England

 

Note: the code described in this article is based on the Beta 1 release of the .NET Framework v 2.0. All the examples can be downloaded from http://www.daveandal.net/articles/schema-api/, and you can also run many of them online from the same location.

 

Most data access technologies provide a way for you to query the schema of a database, and obtain information about the tables, stored procedures, data types, users and other content of a database. However, the technique varies depending on the data provider or database driver you are using, and is not always straight-forward to implement. In ADO.NET 2.0, Microsoft has added a set of features to the standard managed code classes that you can use as an API to query database schemas. This article looks at how you can use these features in your applications. It covers:

 

  • Reading a list of metadata collections from SQL Server
  • Querying other types of database
  • Exploring the various metadata collections
  • Programming with the Schema API
  • Understanding metadata collection restrictions
  • Creating and using a restriction array

The GetSchema Method

Querying the schema of a database involves use of a single method that is added to the Connection class in ADO.NET 2.0. This method, named GetSchema, uses the current connection to retrieve information about the database to which that connection points. To query across databases, you have to create a separate connection to each one, and then query them individually using each connection.

 

The GetSchema method is implemented within the abstract base class DbConnection (in the System.Data.Common namespace), and is available in all the provider-specific connection classes such as SqlConnection, OleDbConnection, OracleConnection and OdbcConnection. There are three overloads of the GetSchema method, the syntax of each one being shown in Table 1.

 

Table 1 - The Overloads of the DbConnection.GetSchema Method

Method Overload

Description

GetSchema()

Returns a DataTable containing a row for each of the metadata collections that are available from the database. Equivalent to specifying the String value "MetaDataCollections" when using the GetSchema(String) overload.

GetSchema(String)

Takes the name of a metadata collection and returns a DataTable containing a row for each item found in that metadata collection in the database.

GetSchema(String, String-Array)

Takes the name of a metadata collection and an array of String values that specify the restrictions on which the rows in the returned DataTable will be filtered. Used to select and return a DataTable containing only the rows in the metadata collection specified in the first parameter that match the restrictions specified in the second parameter.

Reading a List of Metadata Collections from SQL Server

Obtaining a list of the metadata collections that are available is easy, and gives you a good feel for the types of information that you can obtain from your database. The following code, in the Page_Load event handler of the ASP.NET example page named get-collections-sql.aspx, first collects the connection string from web.config and then creates a connection to the database. Then it calls the GetSchema method to obtain a standard ADO.NET DataTable containing the list of metadata collections that are available. Notice that you can use the GetSchema method with the String parameter value "MetaDataCollections", or with no parameter, to get the metadata collection list:

 

Sub Page_Load()

 

  ' read connection string from web.config

  Dim connect As String = ConfigurationSettings.ConnectionStrings( _

                                                "nwind-sql").ConnectionString

 

  ' create a connection to the database

  Using conn As New SqlConnection(connect)

    Try

      conn.Open()

 

      ' get DataTable with all available metadata collections

      Dim dt As DataTable = conn.GetSchema("MetaDataCollections")

      ' or could just use:  Dim dt As DataTable = conn.GetSchema()

 

      ' bind DataTable to GridView control to display rows

      grid1.DataSource = dt

      grid1.DataBind()

 

    Catch ex As Exception

      output.Text = "* ERROR: " & ex.Message

    Finally

      conn.Close()

    End Try

  End Using

 

End Sub

 

 

To display the contents of the DataTable, the code binds it to a GridView control located in the HTML section of the page. The listing below shows the GridView control declaration, and Figure 1 shows the results:

 

<asp:GridView id="grid1" runat="server"

     BorderWidth="1px" ForeColor="Black" BackColor="White"

     CellPadding="4" BorderStyle="None" BorderColor="#dedfde"

     GridLines="Vertical" EnableViewstate="False">

    <HeaderStyle ForeColor="White" Font-Bold="True" BackColor="#6b696b" />

    <RowStyle BackColor="#f7f7de" />

    <AlternatingRowStyle BackColor="White" />

    <SelectedRowStyle ForeColor="White" Font-Bold="True" BackColor="#CE5D5A" />

</asp:GridView>

 

Figure 1 - Reading a List of the Available Metadata Collections from SQL Server

Querying Other Types of Database

It's important to recognize that not all providers will return the same set of metadata collections. For example, using the System.Data.OleDb provider, and the OleDbConnection class, the GetSchema method returns a much more restricted list of collections, as shown in Figure 2.

 

Figure 2 - Reading a List of the Available Metadata Collections using the OleDb Classes

 

Meanwhile, using the System.Data.Odbc.OdbcConnection class gives a different set of collections again. You can see the list in Figure 3. All these examples use the same Northwind sample database, running on SQL Server 2005 ("Yukon") Beta 2 on Windows Server 2003. So you must be careful when writing code that uses the Schema API to ensure that you access only collections that are available through the specific data provider you are using.

 

Figure 3 - Reading a List of the Available Metadata Collections using the Odbc Classes

Querying a Microsoft Access Database

The set of example pages you can download for this article also contains two that query a Microsoft Access (JET) database for lists of metadata collections. The results from these, as shown in Figure 4, demonstrate again how the information available differs depending not only on the type of database, but also the provider you use to access it.

 

Figure 4 - Reading Lists of Metadata Collections from a JET Database using the Odbc and OleDb Classes

Exploring the Various Metadata Collections

Irrespective of which provider you use and which type of database you query using the Schema API, there are some common factors that you can rely on. These make it easier to write code that will work (and adapt itself if required) to the specific database type and provider. Every provider and database combination should expose collections that allow code to find out about the built-in objects and the database engine itself:

 

  • A collection named MetaDataCollections that contains a list of all the available collections, and the number of restrictions that apply to each collection. If in doubt, you can use the GetSchema method with no parameters to get the default collection, which should be the MetaDataCollections collection.

 

  • A collection named DataSourceInformation, containing a single row that provides the name, version and other details about the database. These details include the separation character for multi-part object identifiers (such as Northwind.dbo.Orders), the legal characters and format for string literals, parameters, identifiers, etc.

 

  • A collection named DataTypes that lists all the supported data types, and provides details about them such as the size, whether they are incrementable, case sensitive, nullable, etc.

 

  • A collection named ReservedWords that lists all the words that are reserved for use by the database and code that runs within it (including in stored procedures and SQL statements).

 

  • A collection named Restrictions that lists the restrictions you can apply to filter the results when querying any of the metadata collections. We look at the use of restrictions in detail later in this article.

 

Most of the remaining collections allow code to enumerate the common objects that make up the contents of the database, such as tables, columns, indexes, stored procedures, etc. Notice that some providers expose a Tables and a Views metadata collection, while others (OLE-DB in particular) just expose a Tables collection. In general, with most of the providers, the Tables collection includes Views that are declared within the database, and these are identified by a value in one of the columns in the DataTable that is returned by the GetSchema method.

 

Some providers also allow you to enumerate the databases on the server, where you are connecting to a database server that can support more than one database (when querying an Access database file, for example, there is obviously only one database available). For SQL Server, this collection is named Databases when queried via the SqlClient provider, and Catalogs when queried via the OleDb provider. However, the Odbc provider for SQL Server does not expose this collection.   

The Database Explorer Example Page

Before we go on to look in more detail at using the GetSchema method, you might like to experiment with an ASP.NET page that we provide in the samples for this article. Named database-explorer.aspx, it allows you to see the list of metadata collections for SQL Server using the three standard providers, and a JET database through two providers. The lists of collections are, in fact, the same as you saw in the earlier screenshots. However, you can select a metadata collection and see the entire contents in a second GridView control on the same page.

 

Figure 5 shows the example page in action, with the list of provider and database combinations at the top. Below this is the name and version of the selected database, which are extracted from the DataSourceInformation metadata collection. This is followed by the list of metadata collections that are available for the selected SQL Server with the SqlClient provider (not all are visible in the screenshot), with the Tables collection selected.

 

Figure 5 - Selecting a Metadata Collection in the Database Explorer Example Page

 

Figure 6 shows the second GridView control in this example page, populated with the contents of the Tables metadata collection (again, only part of the list is visible in the screenshot). However, you can see that the information available includes the database name (the table_catalog value), the name of the schema on which the table is based (in this case they are all owned by the dbo schema, but this is not always the case - especially in SQL Server 2005).  There is, of course, the table name, and also the table type. Notice that, as mentioned earlier, views as well as ordinary (base) tables are included in the collection, and are differentiated by the value in the table_type column.

 

Figure 6 - Part of the Contents of the Tables Metadata Collection for the Northwind Database

 

You will certainly find it useful to browse through the collections on your machine (unfortunately, we are not able to publish this example on our public Web site for obvious security reasons). You'll probably be surprised at the amount of detail that is available. If you don't have a database server on your local machine, you can edit the connection strings in the web.config file to query a remote database elsewhere on your network instead.

Programming with the Schema API

You saw a brief example near the start of this article of using the Schema API to get a list of the available metadata collections. In this section, you'll see in more detail how you can extract specific sets of data from these collections, and use it in your applications. As a simple example to start with, you saw in the previous section how the database explorer page we provide displays the name and version of the database being queried. This is easy to do. Assuming you have already created a connection to the database, you just call the GetSchema method with the parameter "DataSourceInformation" to get a DataTable containing a single row that describes the database engine. Then you can pull the values out of the row, format them into the required string, and display them. The listing below shows the code required:

 

' get the database/server details and display them

Dim dt As DataTable = conn.GetSchema("DataSourceInformation")

output-label.Text = String.Format("{0} (version {1})", _

                    dt.Rows(0)("DataSourceProductName").ToString(), _

                    dt.Rows(0)("DataSourceProductVersion").ToString())

 

Figure 7 shows the resulting output from this code. You can, of course, use the data base explorer example page to obtain the column names (and see what data they contain). It will help you write code that takes advantage of the Schema API.

 

Figure 7 - Displaying the Database Engine Name and Version

Understanding Metadata Collection Restrictions

So far, you've seen how you can extract the contents of a metadata collection by specifying the collection name in a call to the GetSchema method. However, there is an overload of the GetSchema method that takes both a collection name and a restriction array (as mentioned in Table 1 earlier). The restriction array is simply an array of String values that are used to filter the rows in the collection, so that you can effectively select only the information you want.

 

Each collection has a specific number of restrictions (i.e. fields or columns that you can filter on), and you must provide a restriction array of the correct size for the collection you specify when you call GetSchema. The number of restrictions is contained the MetaDataCollections collection, as you can see by referring back to Figures 1 to 4. For example, the number of restrictions for the Tables and Columns collections is 4, while the Databases and Catalogs collections (where available) only accept one restriction. Collections that accept zero restrictions do not allow you to filter the rows that are returned - examples being the MetaDataCollections collection itself, and the DataSourceInformation and DataTypes collections.

 

More details of the restrictions for each collection are available from the Restrictions collection, which should be available for all provider/database combinations. You can query this collection with the GetSchema method to see them all (you can’t apply restrictions to this collection), using:

 

Dim dt As DataTable = conn.GetSchema("Restrictions")

 

Alternatively, you can use the Database Metadata Explorer sample we provide. Figure 8 shows the results for the SqlClient provider against a SQL Server 2005 database. The first column shows the metadata collection name, the second column shows the name of the restriction as used within the database (i.e. the name of the object is filters upon), the third column shows the name that defines the restriction (and is used as the column name of the returned DataTable), and the fourth column shows the order of the restrictions for each collection.

 

Figure 8 - Viewing the Restrictions metadata collection in the Database Explorer example page

Unique Identifier Parts

One other item of information returned by the MetaDataCollections collection is the number of identifier parts that are required to make up a unique restriction on each metadata collection. For example, as you can see in Figures 1 to 4, the Tables metadata collection accepts four restrictions, but only three of these (the value in the NumberOfIdentifierParts column) are required to specify a table uniquely. This is because a table can only be of type BASE TABLE or of type VIEW, and yet you cannot use the same name for a table and a view in the same database. Therefore, the combination of the database/catalog name, the owner/schema name and the table name are sufficient to uniquely identify a table.

 

On the other hand, the Columns collection has four restrictions and four unique identifier parts. As you can see in Figure 8, the first three restrictions are the same as for the Tables collection (i.e. the database/catalog name, the owner/schema name and the table name). But these alone are not sufficient to uniquely identify a column in a table, and so the fourth restriction (the column name) is also required. Hence the number of identifier parts for this collection is four.

 

You can, of course, filter on any of the restrictions, including those that are not part of the set of unique identifiers. For example, when using the Tables metadata collection, you can specify a restriction for the table type - perhaps to get a list of all views or all base tables when you don’t specify a table name restriction.

Creating and Using a Restriction Array

Restrictions are useful when you only want to select a specific item, for example to find details about one table that you know exists in the database. As you can see in Figures 1 to 4, you need to use a restrictions array of size four, and the filters you can apply are the database (catalog) name, the owner (or schema name), the table name, and the table type ("BASE TABLE" or "VIEW"). Assuming you know the four values, you can filter on all of them like this:

 

Dim restrictions(3) As String

restrictions(0) = "Northwind"     ' database/catalog name

restrictions(1) = "dbo"           ' owner/schema name

restrictions(2) = "Orders"        ' table name

restrictions(3) = "BASE TABLE"    ' table type

 

Using conn As New SqlConnection("your-connection-string")

  Try

    conn.Open()

    grid1.DataSource = conn.GetSchema("Tables", restrictions)

    grid1.DataBind()

  Catch ex As Exception

    output.Text = "* ERROR: " & ex.Message

  Finally

    conn.Close()

  End Try

End Using

 

 

You can also use just one, or some, of the restrictions. Any array items that you leave set to Nothing (null in C#) will be ignored. For example, if you want all tables named Orders, irrespective of which owner/schema they belong to, you could use:

 

Dim restrictions(3) As String

restrictions(2) = "Orders"        ' table name

restrictions(3) = "BASE TABLE"    ' table type

 

And if you want to find either a table or a view with a specific name, you can omit the fourth restriction as well:

 

Dim restrictions(3) As String

restrictions(2) = "Orders"        ' table name

Querying the Tables Metadata Collection

As an example of using restrictions, the example page named get-tables.aspx demonstrates selecting in the Tables metadata collection with four restrictions, and then with a single restriction. The code uses a generic routine that accepts a connection string, a metadata collection name, and a restrictions array; and returns a DataTable populated from the specified collection. The routine, a function named MyGetSchemaRoutine, is shown in the next listing:

 

Private Function MyGetSchemaRoutine(sConnString As String, _

  sCollectionName As String, aRestrictArray As String()) As DataTable

 

  ' create a connection to the database

  Using conn As New SqlConnection(sConnString)

    Try

      conn.Open()

 

      ' return DataTable with all metadata about Tables that

      ' match the restrictions specified in the array

      Return conn.GetSchema(sCollectionName, aRestrictArray)

 

    Catch ex As Exception

      output.Text = "* ERROR: " & ex.Message

      Return Nothing

    Finally

      conn.Close()

    End Try

  End Using

 

End Function

 

The code in the Page_Load event handler can then call this routine to get a DataSet based on the values passed to it as parameters. In the next listing, you can see that - after fetching the connection string for the web.config file - the code creates a restrictions array that specifies one table. Of course, not all of these restrictions are actually required to select a single table, as the connection itself defines the database that the information will be extracted from, and (as you saw earlier) only the first three restrictions are required to uniquely identify a table. Once the restrictions array is populated, the code calls the MyGetSchemaRoutine function and then displays the results in the first GridView control declared in the HTML section of the page:

 

Sub Page_Load()

 

  ' read connection string from web.config

  Dim connect As String = ConfigurationSettings.ConnectionStrings( _

                          "nwind-sql").ConnectionString

 

  ' create a restrictions array to select specific rows

  ' Tables collection has 4 restrictions:

  ' table_catalog, table_schema, table_name, table_type

  ' null values indicate no restriction

  Dim restrictions(3) As String

  restrictions(0) = "Northwind"

  restrictions(1) = "dbo"

  restrictions(2) = "Orders"

  restrictions(3) = "BASE TABLE"

 

  ' call routine to get DataTable of matching metadata rows

  Dim table As DataTable = MyGetSchemaRoutine(connect, "Tables", restrictions)

 

  ' bind DataTable to first GridView control to display rows

  grid1.DataSource = table

  grid1.DataBind()

  ...

 

The code continues by clearing the restrictions array (setting all the items in it to null) and specifies just the table type restriction. Then it calls the MyGetSchemaRoutine function again, and displays the results in the second GridView control declared in the HTML section of the page:

 

  ...

  ' clear restrictions array

  Array.Clear(restrictions, 0, restrictions.Length)

 

  ' specify new restriction to select:

  ' table_type="VIEW" (i.e. exclude BASE TABLES)

  restrictions(3) = "VIEW"

 

  ' call routine to get DataTable of matching metadata rows

  table = MyGetSchemaRoutine(connect, "Tables", restrictions)

 

  ' bind DataTable to second GridView control to display rows

  grid2.DataSource = table

  grid2.DataBind()

 

End Sub

 

The results of running this example are shown in Figure 9. So you can see how easy it is to get lists of tables from your database using the Schema API. And if all you want to do is check if a table exists, you just specify enough restrictions to uniquely select it in the Tables metadata collection, and then query the Count property of the Rows collection of the DataTable to see if it is equal to one (or greater than zero):

 

Dim restrictions(3) As String

restrictions(0) = "Northwind"

restrictions(1) = "dbo"

restrictions(2) = "Orders"

Dim dt As DataTable = conn.GetSchema("Tables")

If dt.Rows.Count = 1 Then

  ' table exists in database

Else

  ' table does not exist in database

End If

 

Figure 9 - Using restrictions with the Tables metadata collection

Querying the Columns Metadata Collection

As a final example of using the Schema API, the example page named get-column-details.aspx shows a common scenario that you may come across when building data management tools. If you want to know all about a specific table in a database, and in particular details of the columns in the table, you can use the Columns metadata collection. This example queries the Columns metadata collection for columns in the Orders table of the Northwind database, and displays the details in an attractive and readable format.

 

Obtaining the information from the Columns metadata collection is easy enough - you just populate restriction entries that specify the table, but not the column name. In other words, you just require:

 

Dim restrictions(3) As String

restrictions(0) = "Northwind"

restrictions(1) = "dbo"

restrictions(2) = "Orders"

 

Then you call the GetSchema method for the Columns metadata collection:

 

Dim dt As DataTable = conn.GetSchema("Columns", restrictions)

 

The awkward part is converting the contents of the DataTable into an attractive display. We chose to create an HTML table dynamically from the DataTable, adding style attributes to give the appearance shown in Figure 10 (not all columns from the original Orders table in the database are visible in this screenshot). The code to generate the HTML table is not listed here, as it is not really the focus of this article, but you can download the samples from our Web site at http://www.daveandal.net/articles/schema-api/ to view all the code.

 

Figure 10 - Extracting and displaying table column information using the Schema API

 

 

The main point of this example is to give you a chance to explore the details that are available from the Columns metadata collection. You can see that the columns are ordered alphabetically in the DataTable containing the metadata, but their ordinal position is contained in the metadata. Also visible is the default value, nullability, data type, the maximum length for character-based columns, the precision, scale and size for numeric columns, and the character set used - amongst other useful items of information.

 

Alex Homer (alex@stonebroom.com)      ©2005 Stonebroom Limited, England