By Ed Tittel and Dr. Bill Brogden

Cocoon is a powerful XML-based content handling environment that is part of the open source Apache project. As discussed in an earlier overview on Cocoon, the basic structure that makes Cocoon work is called a pipeline, in which components called generators (that grab or create content), transformers (that convert content from one XML document format to another), and serializers (that deliver output in specific formats for rendering or display) are hooked together to perform specific tasks.

Pipelines make it easy to grab, manipulate, and output all kinds of content. This makes Cocoon attractive to companies that would like to use content management and programmatic tools to publish content in multiple formats, but that may find commercial content management systems too expensive or forbidding to contemplate.

Building a Web site with Cocoon is rather like snapping Lego blocks together in that it involves plugging together standard elements to create a complete system. We’ll show you how to go about creating such a site.

Sample conditional combination
Pipelines are defined in an XML document called a sitemap, wherein specific combinations of Cocoon components may be linked to one another. As Listing A illustrates, a simple conditional combination of a generator with various transformers and serializers makes it easy to grab the same content and deliver it in various formats to users. The example shows that it’s pretty straightforward to grab XML content and deliver it in HTML, XHTML, or Wireless Markup Language (WML, an XML-based markup language designed for use on cell phones or handheld devices with Web access) as needed.

A minimal pipeline requires at least a generator and a serializer, but in practice most real pipelines involve a generator plus one or more transformers and a serializer to create output in the desired format. In fact, the sitemap file in Listing A uses different combinations of generator, transformer, and serializer to create output in a variety of forms; the map:match pattern that matches the requested URL determines which combination is used.

In Listing A, we look at a simple pipeline from the current Cocoon 2.0.4 distribution. This pipeline controls creation of the classic “Hello world!” response in various formats. The elements in the map namespace shown in Listing A all define Cocoon components.

The base URL that activates this pipeline is /cocoon/samples/hello-world. The Cocoon servlet examines the URL and looks for a matching pipeline in the sitemap for the /cocoon/samples/hello-world area by testing each map:match pattern against the URL. A match activates the section of pipeline between the map:match tags, which is composed of a generate component connected to a transform component, which in turn connects to a serialize component. Thus the patterns hello.html, hello.xhtml, and hello.wml control which pipeline is used when the example is invoked at runtime.
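The matching step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Cocoon code: the SITEMAP table, the stylesheet file names, and the select_pipeline function are hypothetical, and Python's fnmatch stands in for Cocoon's own pattern matcher, whose wildcard rules differ in detail.

```python
from fnmatch import fnmatchcase

# Hypothetical sitemap entries: (match pattern, generator source, stylesheet, serializer).
# The stylesheet names are illustrative assumptions, not taken from Listing A.
SITEMAP = [
    ("hello.html", "hello-page.xml", "page2html.xsl", "html"),
    ("hello.xhtml", "hello-page.xml", "page2html.xsl", "xhtml"),
    ("hello.wml", "hello-page.xml", "page2wml.xsl", "wml"),
]

# Area of the site that this (hypothetical) sitemap is responsible for.
MOUNT_POINT = "/cocoon/samples/hello-world/"


def select_pipeline(url_path):
    """Return the first pipeline whose pattern matches the URL, or None."""
    if not url_path.startswith(MOUNT_POINT):
        return None
    remainder = url_path[len(MOUNT_POINT):]
    for pattern, source, stylesheet, serializer in SITEMAP:
        if fnmatchcase(remainder, pattern):
            return {"generate": source, "transform": stylesheet, "serialize": serializer}
    return None
```

Calling select_pipeline("/cocoon/samples/hello-world/hello.wml") picks the entry whose serializer is "wml", while a URL that matches no pattern yields None.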

The basic page described in the hello-page.xml file is read by the generate component and turned into SAX parsing events. These events are passed to the transform component, which applies an XSLT transformation that creates markup specific to the desired output format, still as a sequence of SAX events. Finally, the serialize component writes the formatted page as the response to the original request (in HTML, XHTML, or WML format).
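The flow of SAX events through the three stages can be imitated with Python's standard library. The sketch below is not Cocoon code and makes several assumptions: PAGE_XML is an invented stand-in for hello-page.xml (whose contents are not shown here), and a simple element-renaming filter takes the place of the XSLT stylesheet, because the Python standard library has no XSLT processor.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

# Invented stand-in for hello-page.xml; the real file is not reproduced in the article.
PAGE_XML = b"<page><title>Hello</title><content><para>Hello world!</para></content></page>"

# Hypothetical element mapping that plays the role of the XSLT stylesheet.
HTML_NAMES = {"page": "html", "title": "h1", "content": "body", "para": "p"}


class RenameTransformer(XMLFilterBase):
    """Transformer stage: receives SAX events, renames elements, passes them on."""

    def __init__(self, parent, mapping):
        super().__init__(parent)
        self._mapping = mapping

    def startElement(self, name, attrs):
        super().startElement(self._mapping.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._mapping.get(name, name))


def run_pipeline(xml_bytes, mapping):
    """Wire generator -> transformer -> serializer and return the output text."""
    out = io.StringIO()
    generator = xml.sax.make_parser()                  # generator: document -> SAX events
    transformer = RenameTransformer(generator, mapping)
    serializer = XMLGenerator(out, encoding="utf-8")   # serializer: SAX events -> text
    transformer.setContentHandler(serializer)
    transformer.parse(io.BytesIO(xml_bytes))
    return out.getvalue()
```

With these assumed inputs, run_pipeline(PAGE_XML, HTML_NAMES) returns an XML declaration followed by <html><h1>Hello</h1><body><p>Hello world!</p></body></html>. The point of the design is that no stage builds a full document in memory; each one reacts to events and forwards them to the next.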

The ingenuity of this approach is apparent when you consider all the component types that can plug into a pipeline. For example, the initial content events might be generated by executing a program in a scripting language. And the final output might be in PDF format instead of the Web-oriented forms shown in the preceding XML example.

A Web designer can even plug multiple transformers into the middle of a pipeline. For example, an SQL transformer can do a database query and insert retrieved data into an XML document before an XSLT transformer converts it into HTML. There is even an i18n (internationalization) transformer that can replace marked text with translations drawn from message catalogs, so the same page can be delivered in many languages.
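Chaining two transformers can be sketched the same way in Python. Again, this is an illustration under stated assumptions rather than Cocoon's SQL transformer: the query element, the row element, the QueryTransformer and RenameTransformer classes, and the use of an in-memory SQLite database are all invented for the example, and element renaming stands in for the XSLT step.

```python
import io
import sqlite3
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class QueryTransformer(XMLFilterBase):
    """First transformer: replaces each <query> element with <row> elements."""

    def __init__(self, parent, connection):
        super().__init__(parent)
        self._connection = connection
        self._sql = None  # collects SQL text while inside <query>

    def startElement(self, name, attrs):
        if name == "query":
            self._sql = []
        else:
            super().startElement(name, attrs)

    def characters(self, content):
        if self._sql is not None:
            self._sql.append(content)
        else:
            super().characters(content)

    def endElement(self, name):
        if name != "query":
            super().endElement(name)
            return
        sql, self._sql = "".join(self._sql), None
        # Emit one <row> per result, using the first column only.
        for (value,) in self._connection.execute(sql):
            super().startElement("row", AttributesImpl({}))
            super().characters(str(value))
            super().endElement("row")


class RenameTransformer(XMLFilterBase):
    """Second transformer: renames elements (a stand-in for the XSLT step)."""

    def __init__(self, parent, mapping):
        super().__init__(parent)
        self._mapping = mapping

    def startElement(self, name, attrs):
        super().startElement(self._mapping.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._mapping.get(name, name))


def run_pipeline(xml_bytes, connection):
    """Wire generator -> query transformer -> rename transformer -> serializer."""
    out = io.StringIO()
    generator = xml.sax.make_parser()
    first = QueryTransformer(generator, connection)
    second = RenameTransformer(first, {"page": "ul", "row": "li"})
    second.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    second.parse(io.BytesIO(xml_bytes))
    return out.getvalue()
```

Given a table of greetings holding "Hello" and "Bonjour", a source document such as <page><query>SELECT text FROM greetings ORDER BY rowid</query></page> comes out as <ul><li>Hello</li><li>Bonjour</li></ul>. Each transformer sees only the events produced by the stage before it, which is why stages can be added or reordered without touching the others.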

Savvy Web designers can draw on libraries of prefabricated standard components in all three categories and need only occasionally customize a standard component or create an entirely new one. That makes all kinds of sophisticated content access and manipulation functions reasonably straightforward, which helps explain the appeal of Cocoon and why it’s being applied in all kinds of situations and sites.

Standard installation

Note that with a standard Cocoon installation (with the server running on port 80), you would use these URLs to actually run the code example shown:

  • http://localhost/cocoon/samples/hello-world/hello.html
  • http://localhost/cocoon/samples/hello-world/hello.xhtml
  • http://localhost/cocoon/samples/hello-world/hello.wml