Web Development

Turn XML into relational data with OPENXML

XML support has been significantly extended for SQL Server 2005. In this article, Tim Chapman takes a look at how you can shred XML data into a relational format with SQL Server's OPENXML function.

Starting with SQL Server 2000, Microsoft began to provide support for XML data. This support has been significantly extended for SQL Server 2005, allowing for XML data columns, XML variables, and XML indexes.

While storing XML data in the database is a terrific feature, the ability to format the XML data as relational data is essential for a large majority of data processing needs. This is where OPENXML enters the equation. OPENXML is a SQL Server function, which accepts a stream of XML data and provides an in-memory relational rowset view of the XML data.

An OPENXML example

This example shows how you can take advantage of the OPENXML features. Let's assume that we are working with a shopping cart application. In this scenario, customers can log into the Web site and purchase a variety of different products. Due to the company's business model, customers can save more money by purchasing more via the Web site. We'll say that most customers purchase seven or more products. Our goal is to minimize database calls when possible because the site is very busy. The plan is to combine purchases into an XML document at the Web interface and pass that XML data as a character string into a stored procedure at the database level. From there, we will insert the orders from the XML data in one database call.

Get SQL tips in your inbox
TechRepublic's SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system.
Automatically sign up today!

To implement this solution, we will take a look at the XML data that will be compiled at the Web interface. This simple XML document houses the core information obtained from the shopping cart Web app: the product, the price, the date, and the customer information. Listing A contains a sample XML data stream.

Now, we need to design the stored procedure that the Web interface can call to pass the XML string to the database. Listing B contains the text for this stored procedure.

This procedure accepts an XML data type (which is new to SQL Server 2005) as an input parameter to the stored procedure. (We could use a variable character data type for this input parameter, such as VARCHAR(max) or a declared VARCHAR data type.)

We then must call the system stored procedure sp_xml_preparedocument, which creates an in-memory representation of the xml document, which it also accepts as a parameter. Once we have the memory pointer to the XML variable, we can call OPENXML, which you may do with several different parameters that give you very granular control over the result set of the XML data.

We are providing the pointer to the XML document, along with an XPATH query that identifies the nodes in the XML data we wish to return. The WITH clause of the OPENXML function allows you to specify the rowset format that you wish to be returned. Once we insert into our WebSales table, we call the sp_xml_removedocument system stored procedure to clean our XML data from SQL Server's memory.

This simple example shows the potential power of using OPENXML to save database calls. It gives you the flexibility of inserting an entire XML document into a SQL Server table, or allows you to shred the XML file and place it into several different SQL Server tables. Using this technique, we inserted eight records into the database with one procedure call, rather than the eight separate inserts with the traditional technique.

OPENXML limitations

OPENXML is very memory intensive. The pointer returned by the system stored procedure sp_xml_preparedocument is a memory pointer to a COM XML document object model. So, you must be careful not to load large XML documents into memory with OPENXML because it may overload your server's memory.

In a future article, I will show how you can use some new XML features of SQL Server 2005 to accomplish the same functionality of OPENXML without the memory intensive operations of OPENXML.

Tim Chapman is a SQL Server database administrator who works for a bank in Louisville, KY, and has more than 7 years of IT experience. He is also Microsoft certified in SQL Server 2000 and SQL Server 2005. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

5 comments
craig
craig

I have taken a look at this. Is an unusual XML format you have used (in my opinion) you have used Attributes over Nodes for values. I can't get either .query or .node to work at all, throwing error XQuery [query()]: Syntax error near '', expected a "node test". Strange? Anyways, will keep looking as I am sure after this hurdle, it will be a great way to import large amounts of data in RMDBs.

Tony Hopkinson
Tony Hopkinson

Gah, ugh, fooghy Stop messing about what do you want a relational database or markup, pick one ! Any solution that tries to be both is going to be arse.

prachi.rani
prachi.rani

I want to insert data into XML datatype and want to store variable in XML format instead of hardcode.

jmgarvin
jmgarvin

What's wrong with file listeners and XSLTs? ;-p