It’s virtually impossible to pick up an industry magazine and not read about the glories of XML (Extensible Markup Language) and SOAP (Simple Object Access Protocol). Unfortunately, most of these same publications don’t take the time to put SOAP in context alongside its predecessors in the distributed application development space.
Over the next four weeks, we’ll take an in-depth look at what SOAP is, why we need it, and the toolsets and opportunities SOAP affords. In this article, we’ll review the advantages and disadvantages of the current options for developing distributed applications: CORBA (Common Object Request Broker Architecture), DCOM (Distributed Component Object Model), and Java RMI (Remote Method Invocation). I’ll compare SOAP-based systems to systems developed using these current architectures, and I’ll discuss the features that make SOAP an ideal transport as well as the limitations of implementing new applications with SOAP.
Common distributed application development models
CORBA is an industry-standard application development model drafted and supported by members of the Object Management Group. CORBA defines its own generic wire protocol, called the General Inter-ORB Protocol (GIOP), with specific versions mapped to underlying transport protocols. (Internet Inter-ORB Protocol, or IIOP, is the TCP/IP-specific version of GIOP.) CORBA uses a predefined series of GIOP messages passed between Object Request Brokers (ORBs) to establish a reference between the calling system and an object on the called system. The ORB on each system knows how to process GIOP messages for the locally installed operating system, which in turn manages the objects on that system.
Once the calling system obtains a reference to a remote object, it can begin manipulating the object instance created by the called system. Since the calling system can “hold on” to the object created for it on the called system, the CORBA programming model is said to be “stateful.” The objects on either side of the wire can each maintain state information without having to store it in another location (like the file system or a database). Although most CORBA-based implementations run over highly scalable TCP/IP sockets, the overhead of CORBA’s intrinsic automatic state maintenance means that the resulting systems themselves are less scalable. The major advantage of CORBA is the availability of ORBs on a variety of operating systems and platforms.
Where CORBA is an industry standard, DCOM is a distributed application development model developed by Microsoft. It is based on COM (Component Object Model), the technology that allows two local applications to communicate on a Microsoft platform. Microsoft designed DCOM to allow a developer to call an object without knowing whether the object is installed on the local system or on a remote system. The protocols and the runtime that manages them are very tightly defined. The advantage of this approach is that development time is dramatically reduced, since the DCOM runtime manages all object creation and destruction, state management, and garbage collection. DCOM also has advanced security features that allow various levels of both encryption and user context.
If the two systems that need to communicate are both running Microsoft operating systems, then you can write a very robust distributed system and do so very quickly. Unfortunately, if you need to communicate with non-Microsoft systems, you’ll need to install and configure gateways or other plumbing. You should also note that the power of DCOM to accelerate development comes at the cost of reduced performance at runtime. DCOM is also a highly stateful programming model, which makes it more difficult to create applications that scale to Internet levels.
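To see why a stateful model is harder to scale, consider this minimal sketch. It is not CORBA or DCOM code; the class and function names (StatefulCartServer, stateless_total) are hypothetical and simply simulate two servers in one process. In the stateful style, the object reference a client holds is meaningful only on the server that created it. In the stateless style, any server can answer any request.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulCartServer:
    """Stateful style: the server keeps one live object per client reference."""
    carts: dict = field(default_factory=dict)
    next_ref: int = 1

    def create_cart(self) -> int:
        # The returned reference is only meaningful on THIS server instance.
        ref = self.next_ref
        self.next_ref += 1
        self.carts[ref] = []
        return ref

    def add_item(self, ref: int, price: float) -> None:
        self.carts[ref].append(price)

    def total(self, ref: int) -> float:
        return sum(self.carts[ref])


def stateless_total(prices: list) -> float:
    """Stateless style: each request carries everything needed to answer it."""
    return sum(prices)


# Two hypothetical servers sitting behind the same load balancer.
server_a = StatefulCartServer()
server_b = StatefulCartServer()

ref = server_a.create_cart()
server_a.add_item(ref, 10.0)
server_a.add_item(ref, 5.5)
stateful_result = server_a.total(ref)  # every call must return to server_a

# server_b has never seen this reference, so it cannot serve this client.
served_by_other_server = ref in server_b.carts

# The stateless call can be handled by whichever server is free.
stateless_result = stateless_total([10.0, 5.5])
```

The stateful server must also hold each cart in memory until the client releases it, which is the state-maintenance overhead described above.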
Sun’s Java language implements a native capability, called Remote Method Invocation (RMI), to invoke methods on remote objects. Like CORBA’s IIOP, the remoting architecture is based on TCP/IP sockets. But unlike the other programming models, RMI dynamically downloads the remote object’s method definitions to the client once the local Java client has located the remote object whose methods it seeks to invoke (through the network-accessible RMI Registry). DCOM and CORBA both require that the local system’s object know about the remote object’s methods before it attempts to access the remote object. Since Sun designed Java to be used in a distributed TCP/IP network, it supports both stateful and stateless programming. It also has special features that make it ideal for the development of lightweight objects that can be invoked by simple browser clients (assuming the Java runtime is present).
In order for two systems to talk, they must both have an installed Java runtime. Even though Sun or third parties provide Java runtimes for most common systems, RMI also requires that development on both systems be done in the Java language. As an interpreted language, Java offers a high (but not total) degree of portability between systems, but it locks a company into a proprietary language driven by a single vendor: Sun. So the choice of Java RMI vs. DCOM boils down to a proprietary language available on multiple operating systems vs. a variety of languages creating objects that run on a proprietary operating system.
How is SOAP different?
Implementing any of these programming models requires a locally installed runtime on both ends of the conversation. And if the systems that need to communicate aren’t local, then firewalls have to be configured to open the ports each model requires so that information can pass beyond the firewall. SOAP, on the other hand, requires only that the systems have an XML parser and support a common protocol, typically HTTP (Hypertext Transfer Protocol). Of course, SOAP’s inherent support for Internet protocols comes at a cost to the developer. Remember first that SOAP is not designed to be a full-blown runtime environment but a well-defined wire-level protocol. SOAP doesn’t attempt to manage the environment at the same level of sophistication provided by CORBA, DCOM, or RMI. For example, each of these technologies has a mechanism for handling objects that get invoked but never destroyed, allowing the operating system to reclaim abandoned system resources; SOAP specifies nothing comparable.
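To make the "XML parser plus HTTP" requirement concrete, here is a minimal sketch that uses only the Python standard library. The envelope namespace is the one defined for SOAP 1.1; everything else (the GetPrice method, the PartNumber parameter, the urn:example:catalog namespace, and the supplier.example URL) is a hypothetical placeholder. The HTTP request is prepared but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CATALOG_NS = "urn:example:catalog"                      # hypothetical application namespace

# Readable prefixes in the serialized XML (purely cosmetic).
ET.register_namespace("soap", SOAP_ENV)
ET.register_namespace("cat", CATALOG_NS)


def build_envelope(method: str, params: dict) -> bytes:
    """Wrap a method call and its parameters in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{CATALOG_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{CATALOG_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def make_http_request(url: str, payload: bytes, action: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST that would carry the envelope."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{action}"',
        },
        method="POST",
    )


payload = build_envelope("GetPrice", {"PartNumber": "A-100"})
request = make_http_request(
    "http://supplier.example/soap", payload, "urn:example:catalog#GetPrice"
)

# The receiving side needs nothing beyond an XML parser to read the call.
parsed = ET.fromstring(payload)
call_element = parsed.find(f"{{{SOAP_ENV}}}Body")[0]
part_number = call_element.find(f"{{{CATALOG_NS}}}PartNumber").text
```

Nothing here depends on an ORB, a COM runtime, or a Java virtual machine. Because the call travels as an ordinary HTTP POST, it uses the same port a firewall already opens for Web traffic.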
Where the other technologies have various levels of implementation-specific security, SOAP has no predefined security and actually passes data between systems in clear text. And because it is stateless by nature, SOAP doesn’t support any callback mechanism like those provided by the more state-oriented environments. SOAP developers must implement any functionality not included in the SOAP specification in their own distributed architecture.
Luckily, IBM, Sun, and Microsoft are all implementing much of this functionality in their next-generation development environments and operating systems. For example, Microsoft’s Visual Studio .NET and HailStorm system services are next-generation distributed architecture building blocks that use SOAP as their core wire protocol. And IBM is extending its WebSphere platform to be a major player in the SOAP systems development arena. These environments make it simple for developers to create applications that call other systems based on SOAP and create their own system services accessible from any other HTTP-capable client.
SOAP: Coming to a URL near you
While most corporations today are still developing marginally scalable systems that require runtimes using one of the existing distributed computing architectures, many have already begun experimenting with next-generation distributed applications developed using SOAP. Most of these early efforts center on functions whose results are exposed today as HTML on Web pages: companies either convert those functions to SOAP or run SOAP versions alongside them as parallel Web systems. For example, suppliers who want their customers to easily access pricing or delivery information could create URLs that accept SOAP envelopes. These remote calls, containing SOAP envelopes with the parameters required to request catalog or invoice information, can then return SOAP envelopes with the results coded as well-formed XML documents.
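A supplier-side handler for such a URL might look something like the sketch below. It is an illustration under stated assumptions, not a production service: the GetPrice call, the urn:example:catalog namespace, and the price table are all made up, and the function only turns a request envelope into a reply envelope. The code behind the URL would pass the HTTP POST body in and send the returned bytes back.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CATALOG_NS = "urn:example:catalog"                      # hypothetical supplier namespace
PRICES = {"A-100": "19.95", "B-200": "4.50"}            # made-up sample data

ET.register_namespace("soap", SOAP_ENV)
ET.register_namespace("cat", CATALOG_NS)


def handle_soap_request(request_xml: bytes) -> bytes:
    """Read a GetPrice call from a SOAP envelope and return a reply envelope."""
    envelope = ET.fromstring(request_xml)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body is the call
    part = call.findtext(f"{{{CATALOG_NS}}}PartNumber")

    reply = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    reply_body = ET.SubElement(reply, f"{{{SOAP_ENV}}}Body")
    if call.tag == f"{{{CATALOG_NS}}}GetPrice" and part in PRICES:
        result = ET.SubElement(reply_body, f"{{{CATALOG_NS}}}GetPriceResponse")
        ET.SubElement(result, f"{{{CATALOG_NS}}}Price").text = PRICES[part]
    else:
        # SOAP 1.1 reports errors as a Fault element inside the Body.
        fault = ET.SubElement(reply_body, f"{{{SOAP_ENV}}}Fault")
        ET.SubElement(fault, "faultcode").text = "soap:Client"
        ET.SubElement(fault, "faultstring").text = "Unknown method or part number"
    return ET.tostring(reply, encoding="utf-8")


# What a customer's system would POST to the supplier's URL.
request_xml = (
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    b'<soap:Body><cat:GetPrice xmlns:cat="urn:example:catalog">'
    b'<cat:PartNumber>A-100</cat:PartNumber>'
    b'</cat:GetPrice></soap:Body></soap:Envelope>'
)
response_xml = handle_soap_request(request_xml)
price = ET.fromstring(response_xml).findtext(f".//{{{CATALOG_NS}}}Price")
```

The customer’s accounting system reads the Price element out of the reply with its own XML parser, in whatever language it happens to be written in.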
With this functionality implemented, trading partners can integrate their local accounting or operational systems with those of the supplier without having to implement any proprietary runtime or operating system. Imagine the advantage this supplier has over its competitors when it provides technology that allows its customers to reduce the overall cost of delivering their goods or services. Being first to market with this technology may literally be the difference between surviving as a highly connected participant in the Internet community and being driven out of business for insisting on maintaining proprietary systems.
Next week, we’ll look at where SOAP came from and cover the basics of implementing a SOAP architecture using today’s available toolsets.
This is the first of four Landgrave columns on SOAP.