By Brian Schaffner

For some time, the XML Query working group of the World Wide Web Consortium (W3C) has been developing a robust and consistent way of querying XML data. One of its products, the XQuery language, is aimed at querying XML documents and repositories via the Web. Let’s run through some of the features of XQuery.

The use of the Web as a communication medium is nowhere more evident than in current business practices. Companies are pushing their businesses to the Web because it allows for virtually instantaneous exchange of business data. One of the formats used to carry this information payload is XML, which can universally describe data and allows for the data structure to change with limited impact on the systems that are trading data.

Although companies are already sharing data as XML documents, they have been lacking a formal and standard approach for requesting XML data. While some W3C working groups have created technologies for extracting data from XML documents, such as the XPath language developed by the XSL working group, the XML Query group is dedicated to a broader picture—one where users can query XML repositories on the Web and retrieve results as XML documents.

A little history
XQuery builds on several earlier technologies, some still current and some outdated. It is a direct descendant of a query language called Quilt, which a small team assembled from features of several other languages, including XPath, XML-QL, SQL, OQL, Lorel, and YATL. Some of these are closely tied to XML, while others are associated more closely with relational and object databases. This mix reflects the goal of XQuery: to be a language for querying XML repositories rather than a single document (as with XPath).

The language
XQuery queries are made up of expressions based on XML Query Algebra (a product of the XML Query working group). XML Query Algebra provides a formal, mathematical notation for describing the semantics of XML queries and the laws that relate them. It supplies the vocabulary needed to form meaningful XQuery expressions, which in turn are the main vehicle for extracting results from an XML repository or database.

The XQuery language uses the XML Query Data Model (yet another product of the XML Query working group) to facilitate query expressions and resulting data sets. The Query Data Model provides the definition of XML data sets in terms of XML Schemas. Essentially, it allows you to define multiple XML documents or partial XML documents as a single set of XML data. Both the query expression and the results of the XQuery can be described using the Query Data Model. This facilitates the extraction of multiple documents and fractional documents from an XML repository as a single result set.
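As a rough illustration of that idea, the following Python sketch (not XQuery, and not taken from the working group's specifications) treats two separate documents as one data set and returns fragments from both as a single result sequence. The document contents, the element names, and the query_titles helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents standing in for an XML repository.
DOC_A = "<catalog><book><title>XML Basics</title></book></catalog>"
DOC_B = (
    "<catalog>"
    "<book><title>Query Languages</title></book>"
    "<book><title>Data Models</title></book>"
    "</catalog>"
)


def query_titles(documents):
    """Return every <title> element found in any of the documents as one list."""
    results = []
    for text in documents:
        root = ET.fromstring(text)
        # Each match is a fragment of its source document, not a whole document.
        results.extend(root.iter("title"))
    return results


titles = query_titles([DOC_A, DOC_B])
print([t.text for t in titles])
```

The caller sees one flat sequence of nodes and does not need to know which document each fragment came from, which is the property the data model is described as providing.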

XQuery provides a rich set of expressions for accessing XML data. For simple document queries, you can use a Path expression, which is based on the XPath language and closely resembles it. Another type of expression is the Element Constructor, which allows you to construct new XML elements dynamically. The FOR-LET-WHERE-RETURN (FLWR) expression adds iteration and variable binding; its role is loosely comparable to that of a SELECT-FROM-WHERE statement in SQL. Conditional expressions add IF-THEN-ELSE logic, and Quantifiers allow you to specify conditions that must be satisfied by the result set. Finally, Filter expressions provide an interface for extracting certain document nodes based on node identity rather than data values.
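To make these expression types more concrete, here is a minimal Python sketch that mimics several of them with the standard library's ElementTree module. It is an analogy only, not XQuery syntax: the sample catalog, the element and attribute names, and the expensive_books and all_priced helpers are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names and prices are invented.
SOURCE = """
<catalog>
  <book><title>XML Basics</title><price>25.00</price></book>
  <book><title>Query Languages</title><price>60.00</price></book>
  <book><title>Data Models</title><price>45.00</price></book>
</catalog>
"""


def expensive_books(xml_text, threshold):
    """Mimic a FOR-LET-WHERE-RETURN query over <book> elements."""
    root = ET.fromstring(xml_text)
    result = ET.Element("expensive")             # element constructor: new XML
    for book in root.findall("./book"):          # FOR: iterate over a path expression
        price = float(book.findtext("price"))    # LET: bind a variable
        if price > threshold:                    # WHERE: keep only matching items
            item = ET.SubElement(result, "item") # RETURN: build the output
            item.text = book.findtext("title")
            # Conditional expression: IF-THEN-ELSE picks the attribute value.
            item.set("tier", "premium" if price > 50 else "standard")
    return result


def all_priced(xml_text):
    """Mimic a quantifier: the condition must hold for every <book>."""
    root = ET.fromstring(xml_text)
    return all(book.find("price") is not None for book in root.findall("./book"))


result = expensive_books(SOURCE, 40)
print(ET.tostring(result, encoding="unicode"))
print(all_priced(SOURCE))
```

In a real XQuery processor the iteration, filtering, and construction would be written as a single declarative expression; the explicit loop here only shows how the pieces relate to one another.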

The XQuery language is designed to provide a robust query interface to Web-enabled XML repositories. It is based on several other query languages and is being developed by the XML Query working group. The language returns result sets made up of multiple XML documents or partial documents, based on algebraic expressions.