Overcoming decades-old legacy systems with XSLT and Muenchian grouping

Implementing B2B Web services sometimes involves interaction with entrenched legacy systems. One method application developers can use in these situations takes advantage of XSLT and Muenchian grouping.

Recently I needed to create a Web service that would allow an outside organization to enter orders on a legacy system using XML. My initial optimism about this project was squashed by the reality of dealing with the limitations of a system that was designed and coded sometime during the Carter Administration. While I had anticipated that the XML would need to be massaged in order to produce something that my mainframe counterpart could use, I hadn't foreseen how differently the outside organization and the legacy application defined an order.

Muenchian grouping

Most of the major philosophical differences in what constitutes an order stemmed from the fact that the outside organization stored customer information at the line level while the legacy application stored customer information at the order level. This means that every line on the inbound order could conceivably belong to a different customer, a concept that was not contemplated during the Carter years. Another issue that compounded the problem was that the legacy system supported only a single ship-to address, a single bill-to address, and a single carrier per order. What had started out as a relatively simple Web service quickly began to snowball into both a major project and a major headache.

After a brief delusional moment in which I considered writing some kind of convoluted C# program to split an inbound order into something the legacy system could handle, I decided to try a more elegant approach: XSLT and Muenchian grouping.

Downloadable version

A downloadable version of this article is available in PDF form from the TechRepublic Download Center. The PDF version includes the article and all of the code listings in a printer-friendly format.

Developed by, and named after, Steve Muench of Oracle, the Muenchian grouping method uses the xsl:key element and the key() and generate-id() functions to find the first node for each distinct key value and then retrieve all of the nodes that share that value. To illustrate how this works, let's start with the XML document shown in Listing A and, for the sake of simplicity, group only on the customer node. With Muenchian grouping we would create XSLT that looks something like what is shown in Listing B.
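For readers without the listings at hand, the following is a minimal sketch of single-key Muenchian grouping. It is not the article's Listing B: the order, line, customer, and item element names, the sample data in the comment, and the shape of the output are all assumptions made for illustration. Only the key name, keyGroup, is taken from the expression quoted later in the article.

```xml
<!--
  Hypothetical input (an assumption, not the article's Listing A):

  <order>
    <line><customer>Acme</customer><item>Widget</item></line>
    <line><customer>Bravo</customer><item>Gadget</item></line>
    <line><customer>Acme</customer><item>Sprocket</item></line>
  </order>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every line element by the string value of its customer child. -->
  <xsl:key name="keyGroup" match="line" use="customer"/>

  <xsl:template match="/order">
    <orders>
      <!-- Visit only the first line found for each distinct customer. -->
      <xsl:for-each select="line[generate-id(.) = generate-id(key('keyGroup', customer)[1])]">
        <order customer="{customer}">
          <!-- Copy every line that shares this customer. -->
          <xsl:copy-of select="key('keyGroup', customer)"/>
        </order>
      </xsl:for-each>
    </orders>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample in the comment, this sketch should emit two order elements: one for Acme containing both Acme lines, and one for Bravo containing the single Bravo line.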

Examine the components

At first glance Muenchian grouping doesn't appear to make much sense. How can an xsl:key element along with a couple of functions group the nodes in an XML document? To understand the inner workings it is necessary to examine the individual components; only then does the simple elegance become clear.

The xsl:key element is a top-level XSLT element that declares a named key. Its match attribute is a pattern that identifies the nodes to be indexed, which are the nodes the key() function returns, while its use attribute is an expression that supplies the key value for each of those nodes. The purpose of the predicate, [1], is to ensure that only the first node found for each key value is considered. This means that each key value is processed once, rather than once for every time it occurs. If every node sharing a key were processed instead, the elements associated with a key that occurred twice would be output twice, those with a key that occurred three times would be output three times, and so forth.

The generate-id() function is the final piece. It returns a string that uniquely identifies a node, regardless of how the node is obtained. So, if generate-id(.) is equal to generate-id(key('keyGroup',customer)[1]), then both expressions refer to the same node, just obtained differently. It is also important to remember that while the string is unique within a single transformation, there is no guarantee that the same ID will be generated from one execution to the next.
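To watch these components work individually, a small diagnostic stylesheet can print, for every line, how many nodes key() returns and whether the line is the first one in its group. This is only a sketch under assumptions: it presumes an order element containing line elements, each with a customer child, and those element names are hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The same key declaration used for grouping: lines indexed by customer. -->
  <xsl:key name="keyGroup" match="line" use="customer"/>

  <xsl:template match="/order">
    <!-- Deliberately visit every line, not just the first of each group. -->
    <xsl:for-each select="line">
      <xsl:value-of select="customer"/>
      <xsl:text>: nodes sharing this key = </xsl:text>
      <!-- key() returns all line nodes whose customer has this value. -->
      <xsl:value-of select="count(key('keyGroup', customer))"/>
      <xsl:text>, first in group = </xsl:text>
      <!-- Equal IDs mean the current line is the first node for its key. -->
      <xsl:value-of select="generate-id(.) = generate-id(key('keyGroup', customer)[1])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For an input whose lines belong to Acme, Bravo, and Acme in that order, the first two lines should report true and the third false, which is exactly the filter that the grouping expression applies.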

Multiple criteria

Occasionally, when grouping, it is necessary to group information on more than one criterion. Consider the XML document from Listing A. Wouldn't it make sense, in addition to grouping on customer, to group on carrier, ship-to address, and bill-to address? Seems easy enough: just add another xsl:key element and another key() function, right? Wrong!

When using Muenchian grouping, a single xsl:key element and a single key() function are all that is ever needed for a given grouping. Multiple grouping criteria are handled by adding the concat() function, which concatenates the values of the various nodes to produce a single key. With this in mind, Listing C shows the XSLT that performs grouping on the customer, carrier, ship-to address, and bill-to address.
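The composite-key idea can be sketched as follows. This is not the article's Listing C: the carrier, shipTo, and billTo element names are hypothetical, each is assumed to hold simple text, and the '|' separator is an added assumption. A separator that cannot occur in the data is worth including, because without one the values "ab" + "c" and "a" + "bc" would collapse into the same key.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- One composite key built from all four criteria.
       Element names and the '|' separator are illustrative assumptions. -->
  <xsl:key name="keyGroup" match="line"
           use="concat(customer, '|', carrier, '|', shipTo, '|', billTo)"/>

  <xsl:template match="/order">
    <orders>
      <!-- Visit the first line for each distinct combination of values. -->
      <xsl:for-each select="line[generate-id(.) = generate-id(key('keyGroup',
                            concat(customer, '|', carrier, '|', shipTo, '|', billTo))[1])]">
        <!-- Rebuild the composite key once for the current group. -->
        <xsl:variable name="groupKey"
                      select="concat(customer, '|', carrier, '|', shipTo, '|', billTo)"/>
        <order>
          <!-- Order-level data, now the same for every line in the group. -->
          <xsl:copy-of select="customer | carrier | shipTo | billTo"/>
          <!-- Every line that shares all four values. -->
          <xsl:for-each select="key('keyGroup', $groupKey)">
            <line>
              <xsl:copy-of select="item"/>
            </line>
          </xsl:for-each>
        </order>
      </xsl:for-each>
    </orders>
  </xsl:template>
</xsl:stylesheet>
```

The expression passed to key() has to build its string exactly the way the use attribute does, separator included; otherwise the lookup returns nothing.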

How far can Muenchian grouping with concatenated keys be taken? Personally, I've used this technique to group information on eleven different criteria. The resulting XSLT was, when printed, not quite three pages long. While that is sizable for XSLT, it was thousands of lines shorter than programs written in traditional programming languages.

An adaptable solution

While this may, at first, seem like a rather specific application, it is one of those tools that can be readily adapted to other uses. I recommend playing with the examples provided here in order to become comfortable with this technique. In addition, you may want to purchase a copy of XMLSPY from Altova, which was used to produce and test the examples provided here. Not only is it a slick professional XML editor, it also allows the developer to step through an XSLT stylesheet element by element, which comes in handy when debugging Muenchian grouping.