A new approach to locating XML data with XQuery

Finding specific information contained in an XML document can be a challenge. However, XQuery can save you valuable time when searching XML documents. Review the essentials of this W3C specification in this introduction.

By Tony Patton

The rapid adoption of XML throughout the industry has led to an abundance of XML-formatted data. XSLT is the popular method for transforming XML to a required format, but locating data within an XML document is a different story. XPath was developed to easily retrieve items from an XML document, but it requires knowledge of the XML document structure. This fundamental need to locate XML-based information resulted in the development of XQuery (XML Query).

Basically, XQuery is a standard query language developed by the World Wide Web Consortium (W3C). Its relationship to XML data is analogous to SQL's relationship with a backend database, but XQuery is not restricted to XML-based data. XQuery is flexible enough to query a broad spectrum of data sources, including relational databases, XML documents, Web services, packaged applications, and legacy systems. This article provides a quick introduction to the essential aspects of XQuery.

Express yourself
The central idea of XQuery is that everything is an expression. XQuery is a query language rather than a general-purpose programming language, so XQuery scripts (or programs) are themselves expressions. This is where the SQL analogy is appropriate, because SQL statements are basically expressions that interact with backend data, although the expressions may become very complex. Here is a simple example of an XQuery expression:
let $value1 := 0
let $value2 := 1
let $rValue := if ($value1 > $value2) then "true" else "not true"
return $rValue

The preceding few lines represent a simple XQuery expression. It creates variables and assigns values to them, uses flow control via the if expression, and outputs a value via the return keyword. In this example, let is used to assign values, and a dollar sign is prepended to each variable name. The assignment operator is a colon followed by an equal sign (:=). The if structure follows the basic syntax of most languages. The return clause marks the point in the expression where a value is returned. The return value may be a simple variable (as in the previous example), static text, or a mixture of extracted values and text.
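
The last point, a return value that mixes static text with extracted values, can be sketched as follows. This is only an illustration: the variable names and the summary element are hypothetical, chosen for this example rather than taken from any schema.

```xquery
(: Bind two values, then return them mixed with static text. :)
(: The summary element name is an assumption made for this sketch. :)
let $title := "The Information"
let $author := "Martin Amis"
return <summary>{$title} by {$author}</summary>
```

The curly braces mark the parts that are evaluated; everything else inside the element is copied as literal text.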

Basic elements
While the central ingredient of XQuery is the expression, these expressions utilize the following common reserved keywords:
  • for—Process (loop) items within an XML document
  • let—Create and assign variable values
  • where—Conditional statement used in conjunction with the for keyword
  • return—The values returned to the expression originator
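
To see the four keywords working together without any XML input, here is a minimal sketch over a literal sequence of numbers. The variable names are made up for illustration.

```xquery
(: for iterates, let binds a derived value, where filters, return emits. :)
for $n in (1, 2, 3, 4, 5)
let $square := $n * $n
where $square > 5
return $square
```

Only the squares greater than 5 are returned, that is, 9, 16, and 25.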

A much-used acronym for these keywords is FLWR, and such expressions are often called FLWR expressions (the W3C specification now spells it FLWOR, adding order by). Here is a basic XML document, which contains a sampling of books:
<?xml version="1.0" encoding="ISO-8859-1"?>
<books>
<book type="paperback">
<title>American Psycho</title>
<author>Bret Easton Ellis</author>
</book>
<book type="hardback">
<title>A Burnt-Out Case</title>
<author>Graham Greene</author>
</book>
<book type="paperback">
<title>The Information</title>
<author>Martin Amis</author>
</book>
</books>

The preceding XML is used in the following XQuery example:
let $doc := document("books.xml")
for $d in $doc/books/book

In this simple example, every book is returned in the format of its title, followed by some literal text and the author's name. Notice that XPath notation is used to specify individual nodes in the for clause and in portions of the return clause. Two other aspects of this example are noteworthy:
  • document is a standard XQuery function (it is named doc in the final XQuery 1.0 recommendation). It is used to access an XML document or node as an element within the expression. In the previous example, its result was assigned to a variable and later processed with XPath expressions.
  • The values are utilized in the return portion of the expression by enclosing them within curly braces with the appropriate XPath syntax. The current element is accessed using the variable name declared in the for statement.
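
The listing above stops before its return clause. One way the complete query might look is sketched below. The result element name and the literal text " by " are assumptions made for this sketch, and doc is used in place of document because that is the function name in the final XQuery 1.0 recommendation. The path and variable names follow the article's example.

```xquery
(: Load the sample document; "books.xml" is the file name used in the article. :)
let $doc := doc("books.xml")
(: Visit each book element under the books root. :)
for $d in $doc/books/book
(: Hypothetical output element: title text, literal " by ", author text. :)
return <result>{$d/title/text()} by {$d/author/text()}</result>
```

Each pass through the loop produces one result element, so the sample document would yield three of them, one per book.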

There are three books in the sample XML, but only two are paperbacks (as recorded in the type attribute of the book element). This attribute may be used in a where clause that extends the previous example to output only paperback books:
let $doc := document("books.xml")
for $d in $doc/books/book
where ($d/@type = "paperback")

The where clause ensures that only paperback books are returned. Again, this is similar to the structure of a SQL statement in which a condition is used. These examples provide a peek at the XQuery approach and syntax. It is a powerful language for retrieving necessary data, and industry support is growing.
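
The filtered listing also stops before its return clause. A complete version might look like this sketch, in which the result element name and the literal " by " text are assumptions, and doc replaces document as the function name used in the final XQuery 1.0 recommendation.

```xquery
let $doc := doc("books.xml")
for $d in $doc/books/book
(: Keep only books whose type attribute equals "paperback". :)
where $d/@type = "paperback"
(: Hypothetical output element: title text, literal " by ", author text. :)
return <result>{$d/title/text()} by {$d/author/text()}</result>
```

Against the three sample books, this would keep American Psycho and The Information and skip A Burnt-Out Case. The same filter can also be written as an XPath predicate, $doc/books/book[@type = "paperback"], which may be more convenient for simple conditions.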

Where can I get it?
As with most developing technologies, you must search carefully to locate products that support it. Thankfully, XQuery is seen as an industry standard, so the rush to support it has been quick. You can find it in popular tools such as XML Spy and Corel XMetal. In addition, Microsoft has been quick to provide XQuery support for .NET, and the Java community has been on the bandwagon for some time. The support is overwhelming, so the prospect of XQuery becoming deprecated is slim (I never say never).

Only the beginning
The comparison between XQuery and SQL is good, but notable differences are apparent. A major difference is that XQuery cannot update the data source (a basic capability of SQL), nor can it create data sources on the fly. With this limitation in mind, some vendors have chosen to develop a proprietary approach to updating. This is just one item, but it does show the difference and highlights the fact that this is a relatively new technology (version 1.0) that will continue to evolve.
