XPath provides an easier way to select information in Jet database

In the Microsoft .NET Framework, extracting information from a Jet database can be accomplished using SQL queries or XPath. Each method has pros and cons, but you may find yourself using XPath more often as .NET becomes more prevalent.

If you're a power user or developer familiar with Microsoft Access (or other products that use the Microsoft Jet database engine), you probably think of the SQL language when thinking about queries. In a traditional Access application, SQL is the language that you write queries in, and knowing SQL is necessary to write efficient Jet code.

But with the advent of the Microsoft .NET Framework, there's a second way to extract information from a Jet database: the XML Path Language (XPath). XPath is a language for extracting information from XML files, so you might wonder what it has to do with Jet queries. The answer is that the .NET Framework implements a deep-seated equivalence between relational data and XML data. In some cases, XPath will provide an easier way than SQL to describe the data you want.

An XPath example
Let's work through some sample code to execute an XPath query on data extracted from a Jet database. Some prior knowledge of ADO.NET will help here, but you should be able to see what's going on even if you're not an ADO.NET expert. Figure A shows the simple form that I put together for this article.

Figure A
You can use XPath to query Jet information.

This application loads up three tables from the Northwind sample database: Employees, Orders, and Order Details. It then lets you enter an XPath expression. When you click the Go button, it returns the data specified by the XPath expression. In this example, that's the last names of employees who have taken orders for product number 27.

The code in Listing A starts by setting up the ADO.NET objects that it will use: an OleDbConnection object, a DataSet object, and three OleDbDataAdapter objects. The OleDbConnection object provides the connection to the database. The DataSet acts like an in-memory relational database; it can hold data from multiple tables and the relations between those tables. Each OleDbDataAdapter specifies a particular set of data to load from the underlying database.

The next step is to actually retrieve the data. The Fill method of the OleDbDataAdapter object takes the specified data and places it in a named DataTable within the DataSet. Each of the three OleDbDataAdapter objects fills one DataTable, like so:
daEmployees.Fill(ds, "Employees")
daOrders.Fill(ds, "Orders")
daOrderDetails.Fill(ds, "OrderDetails")

At this point, the data is in the DataSet, but it's in three unrelated tables. To specify the relation between the tables (i.e., one employee has many orders, and one order has many order details), the code in Listing B creates a pair of DataRelation objects and adds them to the DataSet's Relations collection.

So far, the code has been entirely working in the relational database realm of ADO.NET. Now comes the switch to XML. The .NET Framework defines another class, the XmlDataDocument, which is simultaneously an XML document and a DataSet. By creating an XmlDataDocument from the DataSet, you can get an XML view of your relational data:
' Retrieve the equivalent XML document
Dim xd As XmlDataDocument = New XmlDataDocument(ds)

From here on, it's pure XML. First, I used the SelectNodes method to apply the XPath expression from the user interface to the data loaded in the XmlDataDocument:
' Run the XPath query
Dim nl As XmlNodeList = _

Finally, I can iterate through the XmlNodeList to show the results of the XPath query on the user interface:
' And display the results on the form
Dim xn As XmlNode
For Each xn In nl

Some XPath expressions
To use this technique, you need to have at least some familiarity with XPath. You can read the full XPath specification, but you may find it rather heavy going. To get you started, I'll dissect some simple XPath expressions here. Let's start with the one shown in Figure A:

This expression has two parts. First, there is a specification of what to return; second, there is a filter (in square brackets) that limits the returned data. The first expression provides a path down through the XML file to the data to be returned. Listing C shows a small portion of the XML equivalent of the data in the DataSet.

The ellipses, of course, show where I've omitted much additional data. The expression /OrderData/Employees/LastName says to locate LastName elements that are children of Employees elements that are children of OrderData elements that are at the root level of the XML file. As you can see, this provides a path (quite similar to a disk file path) to the desired elements—hence the name XPath.

By itself, this expression would return all of the LastName elements from the entire XML file. The second part of the expression, [../Orders/OrderDetails/ProductID=27], filters the expression to return only particular LastName elements. This expression says to start at the parent of the LastName element (specified by the double dot), and then to proceed down through Orders, OrderDetails, and ProductID elements. It then looks for ProductID elements whose value is 27. The LastName (from the first part of the expression) is returned only when the ProductID of 27 is found.

XPath also supports a selection of filtering functions, as in this expression:

An English translation of this might read: "Return the last names of all employees whose last name starts with the letter D." You can also simplify this expression somewhat:

Here, the single dot inside the starts-with function refers to the current node—that is, to the LastName under consideration.

Other filtering functions include contains (which determines whether one string contains another), last (which returns the last element in a collection), string-length (which returns the number of characters in a string), and substring (which returns a substring from a string).

XPath also allows you to specify some queries that are very difficult to write using SQL. For instance, you can use indexing to return particular elements by position, like this:

This code returns the first OrderDetails element from each order in the entire file—a concept that is hard to express using the set-oriented approach of SQL. You can also index as you're specifying the path, like so:

This expression returns the ProductID from the first OrderDetail in the first order taken by the second employee in the file.

There's much more to XPath than I can show you in a short article. If you haven't worked with XPath before, you should at least browse through the specification to get a sense of what's possible.

Choosing the right alternative
So, which is the right choice: SQL or XPath? In some cases, the limits of the respective technologies will dictate which you should use. For instance, SQL can't handle the indexing capabilities of XPath, but XPath can't handle the grouping and totaling capabilities of SQL. In other cases, you can make a choice based on which seems more natural to you; many XPath expressions can be represented as SQL WHERE clauses, and vice versa.

If you're moving into the .NET universe, though, it's clear that you should get used to XPath as a way to select data, because there is no general way to execute SQL statements across multiple tables in a DataSet. You can use complex SQL to fill the DataSet in the first place, but once the data has made it into the DataSet, XPath is a much easier way to select just the information you want to work with. Once you get over the learning curve, it's likely that you'll find lots of ways to save time with XPath.

Editor's Picks

Free Newsletters, In your Inbox