
A SOAP syntax breaker

XML is the correct syntax for Simple Object Access Protocol (SOAP) messages. Learn how to define the tags that add structure to the data in your SOAP messages.


It’s possible to explain Simple Object Access Protocol (SOAP) communication principles using existing Web technologies like JavaScript, HTML forms, and Perl CGI. The obvious drawback to this approach is that SOAP-like messages don’t have the proper SOAP syntax—it’s a nonstandard solution. XML syntax is the correct syntax for SOAP messages.

SOAP is XML and HTTP
SOAP messages are just plain old XML files. XML provides the syntax that lets you use different varieties of markup. Instead of an <html> tag, the SOAP specification describes an <Envelope> tag. In order to make a SOAP request, three sets of tags are required.

Once the tags are assembled, combine them into a document. This document is the SOAP message. Finally, the message must be sent somewhere. Almost certainly you’ll want to send not one but two messages: a request-response pair. One is a “please work on this data” message, and the other is a “here’s what I did to it” message.

The SOAP standard uses HTTP for the request-response message pair—sending SOAP data is no different than loading a Web page or submitting a form. Later I’ll examine how using HTTP affects the content sent over the Internet between Web browser (or request client) and Web server (or response server). First, I’ll take a closer look at a SOAP transaction.

The sample SOAP transaction
In an earlier SOAP icebreaker article, three pieces of information were sent to the server: name, age, and hair color. One piece of information was returned: a description of the person in the form of a sentence of text. Let’s take a look at an example:
  • A person: name (John Doe), age (21), and hair color (Brown).
  • A sentence description: John Doe is a young brunette.

Easy. Now write both down again in simple XML and call these send.xml and recv.xml:
send.xml:
<person>
<name>John Doe</name>
<age>21</age>
<color>Brown</color>
</person>
 
recv.xml:
<description>
John Doe is a young brunette
</description>

Except for some careful revision of the syntax, both SOAP messages are finished. It’s really only the buckets of official syntax that disguise the simplicity of SOAP. You need to tidy up (read: put all the junk in) these SOAP messages.
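
As a rough illustration, the two plain messages above could be generated rather than typed by hand. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper names build_person and build_description are invented for this example and are not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET


def build_person(name: str, age: int, color: str) -> ET.Element:
    """Build the plain <person> element used in send.xml."""
    person = ET.Element("person")
    ET.SubElement(person, "name").text = name
    ET.SubElement(person, "age").text = str(age)
    ET.SubElement(person, "color").text = color
    return person


def build_description(sentence: str) -> ET.Element:
    """Build the plain <description> element used in recv.xml."""
    description = ET.Element("description")
    description.text = sentence
    return description


# Serialize both messages to text (no namespaces or envelope yet).
send_xml = ET.tostring(build_person("John Doe", 21, "Brown"), encoding="unicode")
recv_xml = ET.tostring(
    build_description("John Doe is a young brunette"), encoding="unicode"
)

print(send_xml)
print(recv_xml)
```

The output is the same markup as send.xml and recv.xml, just written on a single line with no whitespace between the tags.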

Polishing request and response messages
The main problem with the initially constructed messages is that all of the syntax comes out of thin air. There’s no hint that the messages are XML, and there’s certainly no hint about what the various tags mean. Improve the send.xml message in this way:
<?xml version="1.0" ?>
<p:person xmlns:p="http://saturn.test.com.au/2002/person">
<p:name>John Doe</p:name>
<p:age>21</p:age>
<p:color>Brown</p:color>
</p:person>

Using XML namespaces (xmlns), I’ve identified all the tags as belonging to some “p” collection of tags called “http://saturn.test.com.au/2002/person.” If the sender and the receiver of the message both understand the collection, the message is meaningful. But where does that collection of tags originate?

That collection of tags is specified in another XML file, one full of tags from the XSchema standard (formally, W3C XML Schema). XSchema is a way of describing new tags, such as <person>. Since I invented the <person> tag, I have an XSchema document defining what that <person> tag looks like. Confusingly, XSchema documents end (by convention) with .xsd, not .xml, and the .xsd extension is left off when the schema is referred to. So http://saturn.test.com.au/2002/person is a person.xsd file, which is just an XML file. It looks like Listing A.

The programmer creates this XML file. It refers to another XML file, the one at “http://www.w3.org/2001/XMLSchema.” That second file is written for you by the standards body, so there’s nothing to do except refer to it every time you make a schema file like person.xsd.

XSchema defines the second set of tags needed. The <element> tag is the primary one. This person.xsd file uses two <element> tags to define the <person> and <description> tags in my example. There’s quite a bit of detail in this schema file, but what’s important to note here is that all message tags are defined, and all the tags have types. For example, the <age> tag is of type="positiveInteger", and it can only appear inside a <person> tag. Data types are important because SOAP messages usually send data, not free text. Even the simple response message (<description>), which could be free text, is instead declared as a string.
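
To make the idea of a typed tag concrete, here is a hand-written check for the one type named above, positiveInteger. It is a stand-in for illustration only: it does not read person.xsd, and a real schema validator does far more. The function name is_positive_integer is invented for this sketch, and the rule it encodes (an optional plus sign, decimal digits, value greater than zero) is my reading of the XSchema definition of that type.

```python
import re

# Lexical form assumed for positiveInteger: optional "+", then decimal digits.
_POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")


def is_positive_integer(text: str) -> bool:
    """Return True if text looks like a positiveInteger (1, 2, 3, ...)."""
    candidate = text.strip()  # surrounding whitespace is ignored
    if not _POSITIVE_INTEGER.fullmatch(candidate):
        return False
    return int(candidate) > 0  # zero is not positive


for sample in ["21", " 21 ", "0", "-3", "twenty-one"]:
    print(repr(sample), is_positive_integer(sample))
```

A receiver that trusts the schema can convert <age> straight to a number, because anything that fails a check like this should have been rejected before it arrived.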

Put the requests into a SOAP envelope
Creating a SOAP message is a bit like writing and mailing a letter. Once you’ve created a SOAP message, put it in a SOAP envelope and write instructions on the front. Here’s what a SOAP envelope looks like:
<Envelope>
<Header> ... </Header>
<Body> ... </Body>
</Envelope>

The ellipses (three dots) indicate where the letter and other content go. Of course, this piece of XML is as lazy as the first send.xml, so clean the syntax up a bit:
<?xml version="1.0" ?>
<env:Envelope xmlns:env='http://www.w3.org/2001/12/soap-envelope'>
<env:Header>
</env:Header>
<env:Body>
</env:Body>
</env:Envelope>

This is the third set of tags you’ll need. The <Envelope>, <Header>, and <Body> tags (case-sensitive!) are all specified for you by the SOAP standard—they describe the pieces of the envelope that surrounds the SOAP data, even though the envelope is technically separate from the data you’re actually trying to send. The whole thing is called a “SOAP message.” Now let’s put the message in the envelope, as shown in Listing B.

Three “xmlns” XML namespace declarations for the three collections of tags are used in the message: two visible above (one is repeated), and one in the person.xsd file. But what’s the <p:control> tag? As well as putting the data into the envelope, you can take advantage of the SOAP header features. Attaching SOAP’s “mustUnderstand” attribute to the <p:control> tag indicates to the receiver of the message that the message must be intelligently processed or else fail totally. The <control> tag is a new tag created just to carry that attribute. Add it to the person.xsd schema, perhaps like this:
<schema:element name="control">
<schema:complexType>
</schema:complexType>
</schema:element>

There’s an XSchema trick here that stops the <control> tag from carrying any content—it resembles an <hr> tag in HTML. Now the SOAP message is complete. If you do all that for the other message, it’ll look like Listing C.
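
The assembly steps described so far can be sketched in code. The example below builds a request envelope with Python's standard xml.etree.ElementTree (Python 3.8 or later for the xml_declaration argument). It is not a copy of Listing B: the helper names qname and build_request are invented, and the mustUnderstand value "1" is an assumption on my part.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2001/12/soap-envelope"
PERSON_NS = "http://saturn.test.com.au/2002/person"

# Ask the serializer to use the same prefixes as the article.
ET.register_namespace("env", ENV_NS)
ET.register_namespace("p", PERSON_NS)


def qname(namespace: str, local: str) -> str:
    """Return a tag name in ElementTree's {namespace}local notation."""
    return f"{{{namespace}}}{local}"


def build_request(name: str, age: int, color: str) -> bytes:
    """Wrap a <p:person> payload in an envelope with a <p:control> header."""
    envelope = ET.Element(qname(ENV_NS, "Envelope"))

    # Header: an empty <p:control> tag that only carries mustUnderstand.
    header = ET.SubElement(envelope, qname(ENV_NS, "Header"))
    control = ET.SubElement(header, qname(PERSON_NS, "control"))
    control.set(qname(ENV_NS, "mustUnderstand"), "1")  # assumed value

    # Body: the actual data being sent.
    body = ET.SubElement(envelope, qname(ENV_NS, "Body"))
    person = ET.SubElement(body, qname(PERSON_NS, "person"))
    ET.SubElement(person, qname(PERSON_NS, "name")).text = name
    ET.SubElement(person, qname(PERSON_NS, "age")).text = str(age)
    ET.SubElement(person, qname(PERSON_NS, "color")).text = color

    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


message = build_request("John Doe", 21, "Brown")
print(message.decode("utf-8"))
```

Note that the serializer declares both namespaces once on the <env:Envelope> tag instead of repeating them further down; to an XML parser that is equivalent.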

All that remains is to send these two messages.

Binding the SOAP messages to HTTP
HTTP is responsible for sending SOAP messages back and forth. It’s the equivalent of a postman carrying the SOAP envelope in his hand to the destination. If you want to send a SOAP message from some programming language, like Perl, Java, or C++, then no browser is involved. You need to know what HTTP headers to use for the request. Here’s what you need for the final send.xml:
POST /transactions/AnalysePerson HTTP/1.1
Host: jupiter.test.com.au
Content-Type: application/soap; charset="utf-8"
SOAPAction: "http://saturn.test.com.au/transactions/AnalysePerson"
Content-Length: 447

The SOAPAction header is optional and sort of a cheat. It tells the receiving system how to handle the incoming message without requiring the receiver to look inside and deduce the content is SOAP content. If a SOAP server is ready to use, you just need to determine if the server adds anything when this header is present. Of course, the headers for the recv.xml SOAP message will start:
HTTP/1.1 200 OK
Content-Type: application/soap; charset="utf-8"
Content-Length: 406

This is because the recv.xml message is returned inside an HTTP response that matches the HTTP POST request.
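
Here is a sketch of how a program might send the request with Python's standard http.client module, with no browser involved. The host, path, and header values are copied from the example above and refer to made-up servers, so post_soap is defined but never called. The function names are invented for illustration. The Content-Type value is the one used in this article; servers built to the final SOAP 1.2 specification expect application/soap+xml, and SOAP 1.1 servers expect text/xml.

```python
import http.client

HOST = "jupiter.test.com.au"  # fictional host from the example
PATH = "/transactions/AnalysePerson"
SOAP_ACTION = '"http://saturn.test.com.au/transactions/AnalysePerson"'


def soap_headers(body: bytes) -> dict:
    """Return the request headers for an already-encoded SOAP message."""
    return {
        "Content-Type": 'application/soap; charset="utf-8"',
        "SOAPAction": SOAP_ACTION,
        # Content-Length counts bytes, not characters.
        "Content-Length": str(len(body)),
    }


def post_soap(body: bytes) -> bytes:
    """POST the message and return the raw response body (recv.xml)."""
    connection = http.client.HTTPConnection(HOST, timeout=10)
    try:
        # http.client adds the Host header itself.
        connection.request("POST", PATH, body=body, headers=soap_headers(body))
        response = connection.getresponse()
        return response.read()
    finally:
        connection.close()


# Eight characters, but nine bytes once the accented letter is UTF-8 encoded.
sample_body = "<a>\u00e9</a>".encode("utf-8")
print(soap_headers(sample_body)["Content-Length"])
```

Computing Content-Length from the encoded bytes, rather than hard-coding a number like 447, keeps the header correct whenever the message changes.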

SOAP messages can be composed and sent by modern browsers, so you might expect the list of HTTP headers to be a little larger. From a browser, it doesn’t matter what all the headers are, but for completeness, a genuine, fully compliant SOAP request from a client might look like Listing D.

In the end, the same three pieces of information are sent (name, age, color), but a lot of extra junk is passed with it. That’s the price of the flexible, character-based HTTP and XML standards.

Seal the envelope
SOAP is a system for passing messages between programs, and the syntax is very straightforward. As long as you remember that the central task is just to define the tags that add structure to the data you send, all the rest is just a load of tasteful Christmas-tree decoration. Don’t be fazed by it. If your ambition leads you to work with SOAP extensively, the best thing you can do is study the SOAP and XSchema standards.

 
