
The X-factor: Supporting XML technologies

TechRepublic columnist Tim Landgrave examines three key XML standards that CIOs need to understand in order to lead their enterprises from the Information Age into the Integration Age.


As XML becomes prevalent, there are corresponding XML-based standards that have already begun shaping the way enterprises view their next-generation application architecture. For most CIOs, it’s sufficient to be conversant in new XML technologies but not necessary to be proficient in implementation.

In this column, I’ll examine three of the key XML standards that CIOs need to understand in order to lead their enterprises from the Information Age into the Integration Age: XML schemas, the Extensible Stylesheet Language (XSL), and XML Query.

Solving the data integration problem
In order for one system to use data provided by another, the receiving system must have some way to both accept the incoming data and break the data down into the appropriate fields and field types.

Many developers grew up writing systems that simply took the output of one COBOL batch job and fed it into another system. In those scenarios, the developer would receive a file electronically or on tape, along with a specification document that described the layout of the records in the incoming file. It was then the developer’s responsibility to parse the file and manipulate the data to fit the receiving system’s field standards. For example, the incoming file might contain a three-byte packed decimal field that unpacks to the text string “010190,” representing the date “January 1, 1990,” which the developer must convert into the receiving system’s internal date format. Without the file layout describing the function of those three positions, the developer would be hard pressed to interpret the values they hold.
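The hand-written translation step described above might look something like the following sketch. It assumes the date occupies three bytes of unsigned packed decimal (two digits per byte) in MMDDYY order, and that the layout document supplies the field's offset; the record contents, the offset, and the 1900 century default are all hypothetical.

```python
from datetime import date

def unpack_bcd(raw: bytes) -> str:
    """Expand packed decimal bytes (two digits per byte) into a digit string."""
    digits = []
    for byte in raw:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError(f"not a decimal digit pair: {byte:#04x}")
        digits.append(f"{high}{low}")
    return "".join(digits)

def parse_mmddyy(text: str, century: int = 1900) -> date:
    """Interpret a six-digit MMDDYY string; the century is an assumption."""
    month, day, year = int(text[0:2]), int(text[2:4]), int(text[4:6])
    return date(century + year, month, day)

# Hypothetical record: a 10-byte customer ID followed by a 3-byte packed date.
record = b"CUST000042" + bytes([0x01, 0x01, 0x90])
DATE_OFFSET, DATE_LENGTH = 10, 3  # values that would come from the layout document

packed = record[DATE_OFFSET:DATE_OFFSET + DATE_LENGTH]
digits = unpack_bcd(packed)
print(digits)                # 010190
print(parse_mmddyy(digits))  # 1990-01-01
```

Nothing in the record itself says where the date starts or how it is encoded, which is why the code is useless without the accompanying specification.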

Of course, this isn’t a problem when two programs are on the same system using the same underlying data store. But in the “connected economy,” these situations are becoming less frequent each day. It’s also incredibly inefficient to continue developing “translation programs” for each new system that we want to incorporate into our information backbone. This is part of the problem that XML is intended to solve. Because XML represents all data as text and includes both the structure and the content of a data set in a single file, it makes it easier for two systems to exchange data.

But passing data around as XML strings puts the onus on the receiving party to validate the contents of the XML and verify that the “valid” contents are also “expected.” For example, a purchase order represented as XML may be "valid" XML but is of little use if you were expecting an invoice.
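A minimal sketch of that two-step check on the receiving side, using Python's standard `xml.etree.ElementTree` module: first confirm that the text parses as XML at all, then confirm that it is the kind of document the receiver expected. The function name and the `PurchaseOrder` and `Invoice` element names are hypothetical.

```python
import xml.etree.ElementTree as ET

def accept_document(xml_text: str, expected_root: str) -> ET.Element:
    """Parse incoming XML and reject documents of an unexpected kind."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed XML: {exc}") from exc
    if root.tag != expected_root:
        raise ValueError(f"expected <{expected_root}>, received <{root.tag}>")
    return root

purchase_order = "<PurchaseOrder><Number>1001</Number></PurchaseOrder>"
try:
    accept_document(purchase_order, expected_root="Invoice")
except ValueError as err:
    print(err)  # expected <Invoice>, received <PurchaseOrder>
```

Checking only the root element is a crude test. It says nothing about whether the fields inside the document have the right names, types, or order.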

Catch up on XML
When XML was just emerging as a key integration technology, I wrote an article series designed to give CIOs the information they need to be fluent in XML-speak. The final article in that series includes links to the previous installments.

Enter XML Schema Definition Language
The XML Schema Definition Language (XSDL) became a formal W3C recommendation in May 2001. The specification outlines a method to define both the structure of an XML document and the valid data types allowed within a document. When a sender provides not only the XML document but also a valid XSDL schema, the receiving system can process the incoming data set more efficiently.

In simple terms, think of XSDL as a universal Type System. Rather than dealing with different definitions of integers, dates, times, and strings from different OSs and machine architectures, XSDL defines a common set of types to which all other systems can map.
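To make the idea of mapping onto a common type set concrete, here is a small sketch of how one platform (Python, in this case) might map a handful of the schema language's built-in types to its own. The table and the `convert` helper are illustrative assumptions, not part of any standard library, and the sketch ignores details such as optional time zone suffixes on dates and times.

```python
from datetime import date, time
from decimal import Decimal

# A few XML Schema built-in types and a Python converter for each one.
XSD_TO_PYTHON = {
    "xs:string": str,
    "xs:integer": int,
    "xs:decimal": Decimal,
    "xs:date": date.fromisoformat,  # lexical form YYYY-MM-DD
    "xs:time": time.fromisoformat,  # lexical form hh:mm:ss
}

def convert(xsd_type: str, lexical_value: str):
    """Turn the text found in an XML document into a native Python value."""
    return XSD_TO_PYTHON[xsd_type](lexical_value)

print(convert("xs:integer", "42"))       # 42
print(convert("xs:date", "1990-01-01"))  # 1990-01-01
print(convert("xs:decimal", "19.95"))    # 19.95
```

Each platform needs only one such table, rather than one translation per partner system.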

By agreeing on a common set, developers are free to combine these into structures that represent real-world objects. The types themselves can be further defined to let receiving systems validate data before attempting to process it. These additional definitions include the following:
  • Length restriction: Requiring a string to be at least or at most x characters long
  • Pattern: Stipulating a regular expression that values must match (for ZIP+4 codes or phone numbers)
  • Enumeration: Restricting the value of an element to a fixed set of options
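In a schema, constraints like these are written as facets such as `maxLength`, `pattern`, and `enumeration`. The sketch below approximates what a validator does with each one. The function names, the ZIP+4 expression, and the order status values are hypothetical, and Python's regular expression dialect is only an approximation of the one the schema language defines.

```python
import re

def check_max_length(value: str, max_length: int) -> bool:
    """Counterpart of a maxLength facet."""
    return len(value) <= max_length

def check_pattern(value: str, pattern: str) -> bool:
    """Counterpart of a pattern facet; the whole value must match."""
    return re.fullmatch(pattern, value) is not None

def check_enumeration(value: str, allowed: frozenset) -> bool:
    """Counterpart of a list of enumeration facets."""
    return value in allowed

ZIP_PLUS_4 = r"\d{5}-\d{4}"                                 # hypothetical pattern
ORDER_STATUS = frozenset({"open", "shipped", "cancelled"})  # hypothetical options

print(check_max_length("Louisville", 20))          # True
print(check_pattern("40202-1234", ZIP_PLUS_4))     # True
print(check_pattern("40202", ZIP_PLUS_4))          # False
print(check_enumeration("pending", ORDER_STATUS))  # False
```

Because the rules travel with the schema, the receiver can reject a bad value before any business logic runs.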

Once data elements are defined, they can be placed in a structure. For example, suppose you’ve defined the inventory structure, the line item structure, and the order header structure for a typical order. Now you have to define how each of these relates to the others within a given XML document. In the case of our sample order, the order header may occur any number of times, but each occurrence must be followed by at least one line item. Line items may occur any number of times, but each may contain only one inventory item, one quantity, and one cost element.
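In a schema, occurrence rules like these are typically expressed with `minOccurs` and `maxOccurs` attributes. As a rough illustration of what a validator then enforces, the sketch below checks the two rules from the sample order by hand. The element names (`Orders`, `OrderHeader`, `LineItem`, `InventoryItem`, `Quantity`, `Cost`) and the flat document layout are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

REQUIRED_ONCE = ("InventoryItem", "Quantity", "Cost")

def structure_errors(root: ET.Element) -> list:
    """Return a list of violations of the sample order's structural rules."""
    errors = []
    children = list(root)
    for index, child in enumerate(children):
        if child.tag == "OrderHeader":
            # Rule 1: every header is followed by at least one line item.
            following = children[index + 1] if index + 1 < len(children) else None
            if following is None or following.tag != "LineItem":
                errors.append(f"OrderHeader at position {index} has no LineItem after it")
        elif child.tag == "LineItem":
            # Rule 2: exactly one of each required child element.
            for name in REQUIRED_ONCE:
                count = len(child.findall(name))
                if count != 1:
                    errors.append(f"LineItem at position {index} has {count} {name} elements")
        else:
            errors.append(f"unexpected element <{child.tag}> at position {index}")
    return errors

good = ET.fromstring(
    "<Orders><OrderHeader/>"
    "<LineItem><InventoryItem>A-1</InventoryItem><Quantity>2</Quantity><Cost>9.50</Cost></LineItem>"
    "</Orders>"
)
bad = ET.fromstring(
    "<Orders><OrderHeader/><OrderHeader/>"
    "<LineItem><InventoryItem>A-1</InventoryItem><Quantity>2</Quantity></LineItem>"
    "</Orders>"
)
print(structure_errors(good))      # []
print(len(structure_errors(bad)))  # 2
```

A schema states the same rules declaratively, so every receiving system can apply them without custom code of this kind.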

It's getting easier
Until now, creating XSDL schemas and matching XML documents has been a tedious, manual process. But every major vendor of application development tools now offers tools that let developers build applications that generate the mappings between that platform’s type system and the type system defined in the XSDL standard.

These tools make it much easier to create applications that can send XSDL-defined data to, and receive it from, other systems. By building new systems that produce data conforming to XSDL schemas, you give the systems that consume the data what they need to validate and process it.
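As a toy illustration of the kind of mapping such tools generate, the sketch below derives schema element declarations from a native type definition. It is not modeled on any particular vendor's product; the `LineItem` class, the `PYTHON_TO_XSD` table, and the `xsd_elements` helper are all hypothetical.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal

# Platform type -> schema built-in type (a deliberately small table).
PYTHON_TO_XSD = {
    str: "xs:string",
    int: "xs:integer",
    Decimal: "xs:decimal",
    date: "xs:date",
}

@dataclass
class LineItem:
    inventory_item: str
    quantity: int
    cost: Decimal

def xsd_elements(cls) -> list:
    """Produce one element declaration per field of a dataclass."""
    return [
        f'<xs:element name="{f.name}" type="{PYTHON_TO_XSD[f.type]}"/>'
        for f in fields(cls)
    ]

for line in xsd_elements(LineItem):
    print(line)
# <xs:element name="inventory_item" type="xs:string"/>
# <xs:element name="quantity" type="xs:integer"/>
# <xs:element name="cost" type="xs:decimal"/>
```

Real tools also handle nesting, occurrence rules, and namespaces, which this sketch leaves out.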

How to get ready
As you begin planning any new system, you should make the mapping of that system's data to standard XSDL schemas a requirement of the systems analysis phase.

If you’re in a business that requires extensive interaction at a systems level with other companies, then you may decide to begin creating definitions for the data that your organization sends between systems or outside the firewall even before new systems are developed. You can use these same schemas not only for new application development but also to allow existing XML messaging engines like webMethods, TIBCO, and BizTalk to pass data between existing legacy systems.
