When most people hear the name “Apache,” they think of the Apache Web server, also commonly referred to as “Apache HTTPD.” However, the Apache Software Foundation has a number of projects that are just as interesting as its flagship Web server. Beyond the Web server, the core technologies Apache supports center on Java and XML, and the contributor list includes Sun (Project X) and IBM (a SOAP implementation).

The beginnings of Apache software
The Apache Software Foundation launched several XML and Java projects in 1999, the same year Roy T. Fielding incorporated the organization as a nonprofit corporation. Apache encourages developers to use its software by supporting standard languages such as Perl, PHP (an Apache project), and, more recently, Java. Even though the Apache Software Foundation has no Perl subprojects, many commerce and business applications have been built with Perl running on top of Apache’s Web server.

I’m currently working with a number of Apache programs because they are free, have decent documentation, and are stable enough to move into a production environment. For developers working with XML and Java, projects from Apache are a great place to explore these technologies. Newer projects, including Cocoon and Xindice, offer developers a chance to try the software without any vendor hype or pressure.

The major Apache Software Foundation XML and Java projects that I’ve experimented with include the following:

  • PHP—The PHP Hypertext Preprocessor project develops the popular PHP programming language.
  • Jakarta/Tomcat—The Jakarta project contains many subprojects, with Tomcat as the centerpiece. Tomcat is the reference implementation of the servlet engine that J2EE specifies for running Java servlets and JSP pages. Tomcat 4.0 improves support for J2EE’s context environments and data sources. There’s nothing fancy about Tomcat; it contains only enough to meet the servlet and JSP portions of the J2EE specification.
  • Xerces and Xalan—These projects encompass an XML parser (Xerces) and an XSLT processor with an XPath engine (Xalan), both designed to implement W3C standards for XML. Java and C++ versions of both tools are available. The current version of Xerces is 2.0.2. XML parsers generally follow the same rules, so developers who have worked with Microsoft’s XML components will find Xerces and Xalan familiar.
  • Axis—Donated by IBM, Axis is a SOAP Web service implementation. With version 2.3, Java-based Axis is straightforward and can be deployed in any servlet container or used from any Java program. Axis includes command-line utilities to create WSDL files and Java proxy clients. Many commercial vendors tout how their products simplify Web services, but the Axis command-line utilities do just as good a job.
  • Cocoon—This project is an XML-based content management platform built on top of Tomcat and Xerces. Cocoon had some problems getting out the door at Apache, but after some rewriting and a new direction, it’s well worth a look. Cocoon uses primarily XML/XSLT, along with a framework to incorporate data from disparate sources.
  • Xindice—As XML-based applications grow, so does the need to manage XML documents well. Xindice uses the XML:DB standard (not yet a W3C candidate) to maintain collections of XML documents that can be easily queried using XPath.
  • Jakarta/James and Jetspeed—James is a scalable e-mail server written completely in Java; Jetspeed is an Enterprise Information Portal. James doesn’t support IMAP4, but it does handle POP3 and SMTP.
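
To give a feel for the parser-plus-XPath combination that several of these projects rely on, here is a minimal sketch. It assumes a current JDK and uses only the standard JAXP interfaces (javax.xml.parsers and javax.xml.xpath), which Xerces and Xalan also implement; it does not call any Xerces-, Xalan-, or Xindice-specific class. The class name XPathSketch and the sample document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of any Apache distribution.
class XPathSketch {

    // Parse an XML string into a W3C DOM tree with whichever JAXP parser is installed.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Evaluate an XPath expression and collect the text of every matching node.
    static List<String> select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data, loosely modeled on the project list above.
        String xml = "<projects>"
            + "<project lang=\"java\">Tomcat</project>"
            + "<project lang=\"java\">Xindice</project>"
            + "<project lang=\"c\">HTTPD</project>"
            + "</projects>";
        Document doc = parse(xml);
        // Select only the projects whose lang attribute is "java".
        System.out.println(select(doc, "/projects/project[@lang='java']"));
    }
}
```

Because the code is written against the standard interfaces rather than a particular parser, the same source should work whether the parser underneath is the JDK's bundled one or a separately installed Xerces.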

Supporting W3C standards
The W3C publishes standards for core Web technologies, including XML, XSLT, and XPath. Apache has done a great job of making sure its projects can serve as references for those standards. Apache’s management and project contributors regularly take part in writing specifications and discuss how those specifications will affect users. Each project at Apache has its own mailing list, and many are archived online.
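
One practical effect of this standards focus is that a stylesheet written to the W3C XSLT 1.0 recommendation should behave the same under any conforming processor. The sketch below illustrates the idea with the standard javax.xml.transform interfaces, which Xalan implements and which a current JDK also ships. The class name XsltSketch, the stylesheet, and the sample data are all assumptions made for illustration, not code from any Apache project.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; not part of any Apache distribution.
class XsltSketch {

    // An XSLT 1.0 stylesheet that emits "name=description;" for each project as plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/projects\">"
        + "<xsl:for-each select=\"project\">"
        + "<xsl:value-of select=\"@name\"/>"
        + "<xsl:text>=</xsl:text>"
        + "<xsl:value-of select=\".\"/>"
        + "<xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply a stylesheet to an XML string using whichever XSLT processor JAXP locates.
    static String transform(String xml, String xslt) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data.
        String xml = "<projects>"
            + "<project name=\"Xerces\">XML parser</project>"
            + "<project name=\"Xalan\">XSLT processor</project>"
            + "</projects>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

The stylesheet itself contains nothing processor-specific, which is the point: swapping the underlying XSLT engine should not require touching either the stylesheet or the calling code.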

The value of Apache projects
All these packages require some time to learn; fortunately, most of the documentation is pretty good. If you have deployed a couple of Web servers and understand how they work, Tomcat is easy to set up. The Xindice documentation is a little out of date, and Xerces and Xalan require an understanding of XML. If you have never worked with XML, mastering Xerces will take quite a bit of time: researching, reading the documentation, and possibly consulting the W3C specifications, or even Sun’s documentation for a Java-related issue.

In some cases, this type of reading and discovery is time-consuming but leads to a better understanding of core technical issues. Of course, you can just quickly modify an included demo and claim that everything works great, which is possible (but not necessarily advisable) with Apache projects.

Backed by a strong leadership position in the developer community, open source Apache software is constantly improving. Vendors can modify and add functions to the source code, then repackage and even sell the code with their own applications. Macromedia has used Xerces and Axis out of the box and included them as standard components of its JRun software. IBM has also expanded upon these technologies and included them with WebSphere. So Apache’s projects defy categorization as purely open source; you’re likely to encounter them even in the most conservative shop.