Introduction to the Cocoon Web-publishing framework

The complexity of developing Internet applications can be overwhelming. The myriad of alternatives only adds to the complexity. Thankfully, Web-publishing frameworks have been created to provide structure to the process. Cocoon is one such approach.

The list of Web-development-technology alternatives can be overwhelming: ASP, JSP, servlets, PHP, and so on. However, the focus for any business is not solely on which technology gives the best performance but on which one enables the company to ship a robust application in minimal time, keeping in mind that the workforce is not a collection of Einstein clones. So, we’re looking for a technology that works in a structured format, is flexible enough to suit the needs of varied applications, and (most important) clearly separates the various tasks.

In this article, I discuss one such technology: the Cocoon Web-publishing framework. First, let’s take a closer look at the Web-publishing framework concept.

Web-publishing framework
A Web-publishing framework is not a new technology. It is an enabler, building on and integrating the power of various technologies to provide a complete and effective Web-development framework. In the world of Java-based frameworks, most seem to build on the power and flexibility of servlets, JSPs, and XML. Quite a few such frameworks exist.

Most of these frameworks rely on the popular Model-View-Controller (MVC) design pattern. In the MVC pattern, all flow is directed to a central controller, which delegates each request to an appropriate handler. In turn, this handler is tied to the business logic of the system (the Model). For the response, the flow is again directed through the central controller to the appropriate view. This process achieves loose coupling of the view and the business logic, making MVC-based systems far easier to create and to maintain. Now, we turn our attention to the finer aspects of Cocoon.
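Before moving on, the MVC flow just described can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class, method, and path names (FrontControllerSketch, lookupGreeting, renderHtml, /greet) are hypothetical and do not belong to Cocoon or to any other framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names throughout; this is not any particular framework's API.
class FrontControllerSketch {

    /** Model: business logic, unaware of how its result is displayed. */
    static String lookupGreeting(String user) {
        return "Hello, " + user;
    }

    /** View: turns a model result into markup, unaware of the business logic. */
    static String renderHtml(String modelResult) {
        return "<html><body><p>" + modelResult + "</p></body></html>";
    }

    /** Controller table: maps each request path to a handler. */
    private static final Map<String, Function<String, String>> HANDLERS = new LinkedHashMap<>();
    static {
        HANDLERS.put("/greet", FrontControllerSketch::lookupGreeting);
    }

    /** Central controller: every request enters and leaves through here. */
    static String dispatch(String path, String param) {
        Function<String, String> handler = HANDLERS.get(path);
        if (handler == null) {
            return renderHtml("No handler for " + path);
        }
        String result = handler.apply(param); // delegate to the Model
        return renderHtml(result);            // route the response to the View
    }

    public static void main(String[] args) {
        System.out.println(dispatch("/greet", "Cocoon"));
    }
}
```

Because the handler and the view meet only inside dispatch, either one can be replaced without touching the other, which is the loose coupling the pattern is after.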

What is Cocoon?
Apache Cocoon is a publishing framework that leverages the power of XML. This claim of “leveraging the power of XML” has become rather banal lately, and almost every new software product claims to do just that. However, Cocoon relies heavily on XML, and the beauty of the framework lies in its smart XML utilization.

All you need is some basic knowledge of XML, XSLT, and Java to get started with Cocoon. However, the degree of Java and XSLT understanding necessary varies, based upon the complexity of the application being developed.

Cocoon is an Apache project, and you can easily download a binary distribution or pick up the latest release from the Apache CVS.

The Cocoon documentation claims that the single most important innovation Cocoon has made is its SoC (Separation of Concerns) design. This enables isolation of the four major concern areas for Web publishing:
  • Management
  • Logic
  • Content
  • Style

With these elements no longer tied to each other, tasks can be allocated to people who are good at one element; they don’t need to understand the others. So, a programmer need not worry about the style of the site or a Web designer about the logic involved.

How Cocoon works
The entire functioning of Cocoon 2 is based on one key concept: component pipelines. As the name suggests, a pipeline is a series of stages: it takes a request as input, processes and transforms it, and then delivers the desired response (see Figure A).

The pipeline components are generators, transformers, and serializers:
  • The generator is responsible for taking in a request and creating an XML structure from an input source.
  • The output of the generators is XML. A transformer converts this output into another XML structure. One or more transformations may be chained to get the desired XML output. The most commonly used transformer is the XSLT transformer.
  • Once the transformation is done, you need a response in the desired format. The serializers handle this task. The input to a serializer is XML; however, the output need not be XML. The HTML serializer is used most often to generate Web pages. A serializer need not always follow a transformer; a generator can also feed a serializer directly.
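The three stages above can be approximated with the standard JAXP classes that ship with the JDK. This is a rough sketch, not Cocoon's own implementation: an XMLReader stands in for the generator, an XSLT TransformerHandler for the transformer, and an identity TransformerHandler with HTML output for the serializer. The class name PipelineSketch, the sample content, and the stylesheet are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Plain JAXP stand-ins for the three pipeline roles; not Cocoon's own classes.
class PipelineSketch {

    // Hypothetical sample content and stylesheet.
    static final String CONTENT = "<page><title>Hello</title></page>";
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/page'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml, String xslt) throws Exception {
        // Assumes the default factory supports SAX handlers (the JDK's built-in one does).
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();

        // Serializer: consumes SAX events and writes them out as HTML text.
        StringWriter out = new StringWriter();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
        serializer.setResult(new StreamResult(out));

        // Transformer: applies the XSLT and forwards SAX events to the serializer.
        TransformerHandler transformer =
            factory.newTransformerHandler(new StreamSource(new StringReader(xslt)));
        transformer.setResult(new SAXResult(serializer));

        // Generator: parses the input source and emits SAX events into the pipeline.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader generator = parserFactory.newSAXParser().getXMLReader();
        generator.setContentHandler(transformer);
        generator.parse(new InputSource(new StringReader(xml)));

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(CONTENT, STYLESHEET));
    }
}
```

Note that no intermediate document is built between the stages; each component simply hands events to the next one, so swapping the serializer for a different output format leaves the generator and transformer untouched.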

Figure A
A basic Cocoon flow

SAX events are passed between these three components. Two other components of Cocoon play a crucial part in its functionality: the sitemap and matchers.

The sitemap is often referred to as the heart of Cocoon. It primarily consists of declarations for pipelines, components, and resources. Whenever a request comes in, this is where it all begins. Without this map, Cocoon would be unable to decipher how any request should be tackled or how to find the desired resource.

Sitemap management should be handled by the site manager and should not be of much concern to the programmer. Pipelines should be the only things you need to interact with. You can begin creating your own resources or components at a later stage.

The best way to understand what the sitemap does is to open the sitemap.xmap file in your favorite XML editor and study and fiddle with the existing XML structure, or create your own bits of XML.
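To give a feel for what you will find in that file, the sketch below embeds a trimmed sitemap fragment in a Java string and lists the steps declared under each match. The map: element names and the namespace URI follow the Cocoon 2 sitemap format as I understand it; the pattern and the file paths (docs/{1}.xml, stylesheets/page2html.xsl) are hypothetical, and a real sitemap.xmap also carries a components section that is omitted here. The class SitemapSketch is purely illustrative and not part of Cocoon.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative helper only; Cocoon reads the sitemap itself.
class SitemapSketch {

    static final String MAP_NS = "http://apache.org/cocoon/sitemap/1.0";

    // Trimmed fragment with hypothetical paths; the components section is omitted.
    static final String SITEMAP =
        "<map:sitemap xmlns:map='" + MAP_NS + "'>"
      + "<map:pipelines><map:pipeline>"
      + "<map:match pattern='*.html'>"
      + "<map:generate src='docs/{1}.xml'/>"
      + "<map:transform src='stylesheets/page2html.xsl'/>"
      + "<map:serialize type='html'/>"
      + "</map:match>"
      + "</map:pipeline></map:pipelines>"
      + "</map:sitemap>";

    /** Returns one line per match element: its pattern followed by its steps. */
    static List<String> describe(String sitemapXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(sitemapXml)));

        List<String> lines = new ArrayList<>();
        NodeList matches = doc.getElementsByTagNameNS(MAP_NS, "match");
        for (int i = 0; i < matches.getLength(); i++) {
            Element match = (Element) matches.item(i);
            StringBuilder line = new StringBuilder(match.getAttribute("pattern")).append(" ->");
            // Child elements appear in pipeline order: generate, transform, serialize.
            for (Node n = match.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    line.append(' ').append(n.getLocalName());
                }
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        describe(SITEMAP).forEach(System.out::println);
    }
}
```

Reading the fragment top to bottom mirrors the pipeline itself: a request ending in .html is matched, a generator reads the corresponding XML source, a stylesheet transforms it, and the HTML serializer produces the response.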

Matchers are powerful sitemap components. The sitemap uses them to identify which request has come in and which resource is to be utilized. When a request comes in, processing follows the first pattern that matches. The power of matchers lies in their strong support for wildcards and regular expressions, making the creation of a sitemap a relatively simple task.
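The first-match rule and the wildcard substitution can be modeled in plain Java. What follows is a simplified sketch, not Cocoon's matcher code: it assumes that a single * matches any run of characters other than a slash, that ** matches anything, and that {1}, {2}, and so on refer to what each wildcard captured. The class WildcardMatchSketch and all rule entries are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model of wildcard matching; not Cocoon's implementation.
class WildcardMatchSketch {

    /** Converts a wildcard pattern to a regex: '*' skips no '/', '**' matches anything. */
    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < wildcard.length()) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                boolean doubled = i + 1 < wildcard.length() && wildcard.charAt(i + 1) == '*';
                regex.append(doubled ? "(.*)" : "([^/]*)");
                i += doubled ? 2 : 1;
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the source of the first matching rule with {n} filled in, or null. */
    static String resolve(Map<String, String> rules, String request) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = compile(rule.getKey()).matcher(request);
            if (m.matches()) {
                String source = rule.getValue();
                for (int g = 1; g <= m.groupCount(); g++) {
                    source = source.replace("{" + g + "}", m.group(g));
                }
                return source; // first match wins; later rules are never consulted
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps declaration order, which is what "first match" depends on.
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("index.html", "docs/home.xml");
        rules.put("*.html", "docs/{1}.xml");
        rules.put("**", "docs/not-found.xml");

        System.out.println(resolve(rules, "index.html"));   // docs/home.xml
        System.out.println(resolve(rules, "about.html"));   // docs/about.xml
        System.out.println(resolve(rules, "img/logo.png")); // docs/not-found.xml
    }
}
```

Order matters in such a rule set: index.html also satisfies *.html, so the specific rule has to be declared before the general one, and a catch-all belongs at the end.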

What else does Cocoon offer?
Besides the systematic and clearly demarcated approach that Cocoon brings to Web publishing, it also makes development much easier. Cocoon brings with it a host of serializers, generators, and other useful gizmos. So, creating an Acrobat .pdf file or creating Scalable Vector Graphics (SVG) from XML takes just a simple transformation. Because Cocoon relies on XML, it also enjoys the independence that comes with it. Thus, creating or adapting applications for various devices like WAP, Voice, and so on is greatly simplified. Cocoon interacts easily with data sources like RDBMS, LDAP, and native XML databases.

An important spin-off of the Cocoon project is XSP (eXtensible Server Pages). XSP’s goal is to deliver where JSP might have failed. XSP is supposed to achieve separation of logic from presentation, something that JSP attempts but falls short of.

On the flip side, it takes some time to become accustomed to Cocoon. Having a workforce in place that understands the “hows” and “whats” of Cocoon is no easy task. Considering the relatively easy availability of JSP developers and JSP’s widespread acceptance, basic JSPs with proper usage of tag libraries could at times be a better option than using XSP with Cocoon. Cocoon also relies on many other Apache projects for bits and pieces of functionality.

Cocoon is a great alternative, worthy of being seriously considered for Web publishing by businesses of all sizes. It also boasts high-traffic, high-membership mailing lists; more often than not, your queries will be answered in no time.
