Cocoon Separates Content from Style
One challenge a Web developer faces is providing dynamic content in a way that obviates the need to change code when a new browser comes out or a new platform is targeted. Meeting this challenge becomes more difficult as marketing departments and users continuously request new features or a new look for Web applications. Web publishing frameworks are one answer to this challenge.
If you have done much reading about XML, you have undoubtedly learned that it allows you to separate content from style. Following this bold statement, you'll rarely find a practical example that shows how this separation is accomplished. Cocoon, which is a 100 percent pure Java XML publishing framework, is an outstanding example. Cocoon uses dynamic Extensible Stylesheet Language Transformations (XSLT) to serve documents to a browser, handheld device, telephone, PDF, you name it--all from a single-source XML document. In addition, Cocoon is fast. The current version has an intelligent cache that speeds up processing. The next version of Cocoon is faster yet, using precompiled stylesheets and the Simple API for XML (SAX) parser.
Another outstanding feature of Cocoon is its price: free. Cocoon is a subproject of the Apache XML project, which falls under the guidance of the Apache Software Foundation. Software developed for the various Apache projects is commercial grade and widely deployed. The most notable of these is the Apache Web server. Like all Apache Software Foundation software, Cocoon is open source and developed by volunteers.
Separating Style, Content, and Logic
The Cocoon project started in 1999. The original version of Cocoon was a learning tool that used a servlet to apply static Extensible Stylesheet Language (XSL) styling. Since then, Cocoon has evolved into a capable XML publishing framework.
One of HTML's drawbacks is that it mixes content with formatting information. For example, the HTML displaying information as a table uses markup that defines the information based on its presentation structure. That markup would include tags like <table>, <tr>, <td>, </td>, </tr>, and </table>. Another HTML drawback is its limited style support. Originally, HTML forced you to include all style-related markup directly in an HTML document. More recently, it became possible to use Cascading Style Sheets (CSS) to encapsulate some HTML formatting elements.
However, XML, with the help of Cocoon, addresses HTML's shortcomings. This combination allows you to use markup that accurately describes the content and completely separates content from presentation structure and style. With XML, you have tags like <customer> and <product>. Because of these tags, the markup helps to enhance the meaning of the content, enabling things like context-sensitive searching and context-related selection. You can also apply much richer styling to XML documents using a combination of XSLT and XSL Formatting Objects (XSL-FO).
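To make the separation concrete, here is a minimal sketch in Java that keeps the content (an XML document with <customer> tags) and the style (an XSLT stylesheet) in two separate strings and combines them only at transformation time. It uses the standard JAXP API found in current JDKs rather than Cocoon's own classes; the class name, the sample data, and the stylesheet are assumptions made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ContentStyleDemo {
    // Content only: hypothetical customer data with no presentation markup.
    static final String CONTENT =
        "<customers>"
      + "<customer><name>Acme</name></customer>"
      + "<customer><name>Globex</name></customer>"
      + "</customers>";

    // Style only: a hypothetical stylesheet that renders the content as an HTML table.
    static final String STYLE =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/customers'>"
      + "<table><xsl:apply-templates select='customer'/></table>"
      + "</xsl:template>"
      + "<xsl:template match='customer'>"
      + "<tr><td><xsl:value-of select='name'/></td></tr>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the document and returns the styled result.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(CONTENT, STYLE));
    }
}
```

Swapping in a different stylesheet changes the presentation without touching the content string, which is the point of the separation.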
Design patterns describe commonly used program components, architectures, and their relationships in a way that makes it easier for others to reuse the design pattern. The current release of Cocoon, Version 1.8.2, uses the Reactor design pattern. The main components used in this Reactor implementation are a request wrapper, producer, reactor, formatter, response, and loader. Upon receipt, the request wrapper encapsulates the request; next, the producer returns the associated XML document. The reactor then dynamically processes the document based on mappings that match document content to processors. If necessary, the reactor applies recursive processes. The formatter then transforms the output into a format suitable for the client that is encapsulated in a response. If necessary, the loader formats executable code and compiled Extensible Server Pages (XSPs).
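The flow from producer to reactor to formatter can be sketched roughly as follows. This is not Cocoon's actual code: the interfaces, the Doc class, and the "upper" and "wrap" processor types are hypothetical stand-ins, and documents are plain strings to keep the sketch short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReactorSketch {
    // Hypothetical stand-ins for the roles described above.
    interface Producer { Doc produce(String request); }
    interface Processor { String process(String body); }
    interface Formatter { String format(String body); }

    /** A document body plus the processing types it asks for, in order. */
    static final class Doc {
        final String body;
        final List<String> steps;
        Doc(String body, List<String> steps) {
            this.body = body;
            this.steps = steps;
        }
    }

    // The reactor's mapping from processing type to processor.
    private final Map<String, Processor> processors = new HashMap<>();

    void register(String type, Processor processor) {
        processors.put(type, processor);
    }

    // Produce the document, run each requested processor, then format the result.
    String handle(String request, Producer producer, Formatter formatter) {
        Doc doc = producer.produce(request);
        String body = doc.body;
        for (String type : doc.steps) {
            Processor processor = processors.get(type);
            if (processor == null) {
                throw new IllegalArgumentException("No processor for type: " + type);
            }
            body = processor.process(body);
        }
        return formatter.format(body);
    }

    // Wires up two toy processors and runs one request through the chain.
    static String demo() {
        ReactorSketch reactor = new ReactorSketch();
        reactor.register("upper", body -> body.toUpperCase());
        reactor.register("wrap", body -> "<p>" + body + "</p>");
        Producer producer = request -> new Doc("hello", Arrays.asList("upper", "wrap"));
        Formatter formatter = body -> "<html>" + body + "</html>";
        return reactor.handle("/hello.xml", producer, formatter);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints <html><p>HELLO</p></html>
    }
}
```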
As I mentioned earlier, Cocoon was originally a learning tool designed and built using the tools available in 1999. Since then, XML and the tools that support XML have advanced rapidly. The next version of Cocoon, Cocoon 2 (currently in beta), uses a new architecture that takes advantage of these advancements. The majority of these changes focus on performance, scalability, and management. Addressing performance and scalability, Cocoon 2 switches from a Document Object Model (DOM) parser to a more efficient SAX parser. A parser is an application that interprets data in an XML document and presents it to a requesting program. SAX parsers are event-driven and use less memory than the DOM parsers they replace. Although not as significant on the iSeries as on other platforms, this change reduces memory requirements for large documents and reduces translation time.
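To illustrate what "event-driven" means here, the following sketch counts elements with the SAX API that ships with current JDKs. The parser calls startElement once per opening tag and never builds an in-memory tree, which is where the memory savings come from. The class and method names are invented for this example and are not part of Cocoon.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountDemo {
    // Counts how many elements with the given name appear in the document.
    static int countElements(String xml, String name) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Called by the parser for every opening tag it encounters.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<order><product/><product/><customer/></order>";
        System.out.println(countElements(xml, "product")); // prints 2
    }
}
```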
Site management is where Cocoon 2 differs significantly from Version 1.8.2. Instead of using the Reactor design pattern, Cocoon 2 relies on an event-driven pipeline-mapping model. This change came about with the realization that the relationships between processors and documents are limited, and that the number of relationships does not grow significantly as a site grows in size and complexity. With Cocoon 2, document generators trigger events handled by various processors, serializing the results to form the response stream. A sitemap defines these relationships and matches requests to the proper response-producing pipeline. Cocoon 2 compiles sitemaps into Java classes at startup to further enhance performance.
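The idea of a sitemap that matches requests to pipelines can be sketched as an ordered list of patterns, each tied to a chain of processing steps. The sketch below is only loosely modeled on that idea and is not Cocoon 2's sitemap format or API: the wildcard syntax, the class name, and the toy generator and transformer are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

public class SitemapSketch {
    // Patterns are checked in registration order; the first match wins.
    private final Map<Pattern, Function<String, String>> pipelines = new LinkedHashMap<>();

    // Registers a pipeline for a wildcard pattern such as "*.html".
    void match(String wildcard, Function<String, String> pipeline) {
        // Quote the whole pattern, then reopen the quoting around each "*".
        String regex = Pattern.quote(wildcard).replace("*", "\\E.*\\Q");
        pipelines.put(Pattern.compile(regex), pipeline);
    }

    // Finds the first pipeline whose pattern matches the request and runs it.
    String respond(String uri) {
        for (Map.Entry<Pattern, Function<String, String>> entry : pipelines.entrySet()) {
            if (entry.getKey().matcher(uri).matches()) {
                return entry.getValue().apply(uri);
            }
        }
        throw new IllegalArgumentException("No pipeline matches: " + uri);
    }

    // Builds a two-entry sitemap and serves one request from it.
    static String demo(String uri) {
        Function<String, String> generate = request -> "<doc src='" + request + "'/>";
        Function<String, String> toHtml = xml -> "<html>" + xml + "</html>";
        SitemapSketch sitemap = new SitemapSketch();
        sitemap.match("*.html", generate.andThen(toHtml)); // generate, then transform
        sitemap.match("*", generate);                      // fallback: raw XML
        return sitemap.respond(uri);
    }

    public static void main(String[] args) {
        System.out.println(demo("index.html"));
        System.out.println(demo("data.xml"));
    }
}
```

Because the set of patterns stays small even when the number of documents grows, the mapping can live in one place instead of inside every document.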
Get Your Copy Today!
To download or find out more about Cocoon, go to the Apache Software Foundation's XML project page. In addition to Cocoon, this page contains an overview of the various Apache XML subprojects, including the following technologies used by Cocoon:
- Xerces -- An XML parser originally developed by IBM as XML Parser for Java, and donated to the Apache Software Foundation
- Xalan -- An XSLT stylesheet processor developed by Lotus as LotusXSL
- FOP -- An XSL-FO print formatter used to transform XML documents to PDF
The easiest way to obtain and install Cocoon is to download the latest binary release. At this time, the latest stable release of Cocoon is Version 1.8.2. The target release date for Cocoon 2 is early fourth quarter of this year, although currently many people are successfully running pre-release versions.
In order to run Cocoon, you need a servlet engine that supports Version 2 or later of the servlet API. There are several servlet engines available on the iSeries, including IBM's WebSphere, the Apache Software Foundation's Tomcat, and BEA's WebLogic. If you are not already using a servlet engine on your system, I suggest you start with Tomcat. I hope to post more detailed information that describes how to set up Tomcat; watch this space.
Once you have a suitable servlet engine installed, you are ready to install Cocoon. These instructions describe the installation of Cocoon with Tomcat. The instructions for WebSphere are similar. On my system, Tomcat is installed off the root in a directory called tomcat. If the majority of your site's content is dynamic and serviced by servlets, you might consider configuring Tomcat to listen on port 80 rather than its default port 8080. The most important thing to be aware of with WebSphere is that installing the servlet 2.2 JAR that comes with Cocoon will break WebSphere because WebSphere uses a non-standard servlet API.
Go to the Cocoon download site. Now select the Cocoon-1.8.2.zip link, which will download a zip file containing the full binary release. The 1.8.2 binary distribution contains quite a few JAR files from other Apache projects. In most cases, these are not the latest version; however, they have been tested together and work. Upgrading to a more recent version of any of these JAR files is likely to expose cascading dependencies that require further upgrades.
One way to get Cocoon onto your system is to send the Cocoon-1.8.2.zip file to your iSeries using FTP. If you choose this route, you will need to use the JAR utility from Qshell to extract Cocoon's files. To copy the zip file to your iSeries, start FTP and connect to your iSeries. Next, replacing c:\temp with the location of the zip file on your workstation and /tmp with the target directory on your iSeries, type in the following commands: bin, quote site namefmt 1, and put c:\temp\cocoon-1.8.2.zip /tmp/cocoon-1.8.2.zip. Once you have the zip file on your iSeries, start a Qshell session using the command QSH. Change to the OS/400 Integrated File System (OS/400 IFS) root and extract the zip file using the following command. Wait for the dollar sign ($) that indicates that the command is complete.
jar -xf /tmp/cocoon-1.8.2.zip
Rather than using FTP and the JAR utility, I set up Tomcat and Cocoon to run on my workstation before I installed them on my iSeries. I then mapped an AS/400 NetServer drive to Tomcat's installation directory using Operations Navigator and used drag and drop to copy the installation to the iSeries. This approach reduces the amount of data transferred to the iSeries and gives you a personal test environment that won't interfere with production work or other programmers.
Either way, once you have Cocoon extracted on either your workstation or iSeries, it is time to copy Cocoon's JAR files, configuration, and Samples directory to the Tomcat installation. First, copy cocoon.jar from Cocoon's bin directory to Tomcat's lib directory. Next, copy the bsfengines, bsf, fop_0_15_0, turbine-pool, w3c, xalan_1_2_2, and xerces_1_2_3 JAR files to Tomcat's lib directory.
After copying the required JAR files into Tomcat's lib directory, you should compile them. Although this step is not required, compiling will improve performance. These compiles can take a few hours, so submit them to batch using a command like the following for each JAR file:
SBMJOB CMD(CRTJVAPGM CLSF('/tomcat/lib/bsfengines.jar') OPTIMIZE(40)) JOB(bsfengines)
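If you would rather not type one command per JAR file, a small program can generate them. The sketch below builds a command in the same shape as the one above for each JAR file named earlier. The job-name rule (drop anything that is not a letter or digit and keep at most 10 characters) is my own assumption for this example, so check the results against your system's naming rules before submitting anything.

```java
import java.util.Arrays;
import java.util.List;

public class CompileJobs {
    // Assumed rule: keep only letters and digits, and at most 10 characters.
    static String jobName(String jarBaseName) {
        String cleaned = jarBaseName.replaceAll("[^A-Za-z0-9]", "");
        return cleaned.length() > 10 ? cleaned.substring(0, 10) : cleaned;
    }

    // Builds one SBMJOB command for a JAR file in the given lib directory.
    static String submitCommand(String libDir, String jarBaseName) {
        return "SBMJOB CMD(CRTJVAPGM CLSF('" + libDir + "/" + jarBaseName
            + ".jar') OPTIMIZE(40)) JOB(" + jobName(jarBaseName) + ")";
    }

    public static void main(String[] args) {
        // JAR base names taken from the copy step; adjust to match your lib directory.
        List<String> jars = Arrays.asList("cocoon", "bsfengines", "bsf", "fop_0_15_0",
            "turbine-pool", "w3c", "xalan_1_2_2", "xerces_1_2_3");
        for (String jar : jars) {
            System.out.println(submitCommand("/tomcat/lib", jar));
        }
    }
}
```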
While you are waiting for the JAR files to compile, set up Cocoon and Tomcat's configuration files. Start by creating two new directories where you will place .xml files served by Tomcat. Use the following commands:
MKDIR DIR('/tomcat/webapps/cocoon')
MKDIR DIR('/tomcat/webapps/cocoon/WEB-INF')
Now copy Cocoon's configuration files into the new WEB-INF directory. Use drag and drop from your Cocoon installation on your workstation or the following commands if you extracted Cocoon to your iSeries:
If the directory names from the Tomcat installation on your workstation are different or if you extracted Cocoon to your iSeries, you will need to identify where the cocoon.properties file is located in the web.xml file. To do this, replace [path-to-cocoon]/conf with WEB-INF. Use the EDIT FILE (EDTF) command:
The result should be a line like this:
Now you need to let Tomcat know what files the Cocoon servlet handles. Use the EDTF command:
Add the following lines in the section titled Special webapps:
If you want to run the samples that come with Cocoon, copy the Samples directory to Tomcat's webapps/cocoon directory. For security reasons, if Cocoon is running on a production system, you should remove the Samples directory when you are done testing. If you extracted Cocoon to your iSeries, use the following command in Qshell to copy the samples to your Tomcat installation:
cp -R /cocoon-1.8.2/samples /tomcat/webapps/cocoon
The last step is to modify the Tomcat startup script to add the new JAR files. You have to make sure that the Xerces JAR file is first in the classpath. If you are using the tomcat400.sh script from Don's Tomcat article, replace the line that reads CLASSPATH=./ with one that reads CLASSPATH=$TOMCAT_HOME/lib/xerces_1_2.jar:./. This will duplicate the Xerces parser JAR file on your classpath and ensure that Xerces is ahead of any other parser.
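A quick way to confirm the ordering is to inspect the classpath string itself. The following sketch checks whether the first classpath entry is a file whose name starts with "xerces". The class name and the matching rule are assumptions for illustration, and the check looks only at entry order, not at which parser classes the JVM actually loads.

```java
import java.io.File;
import java.util.regex.Pattern;

public class ClasspathOrderCheck {
    // Returns true when the first classpath entry is a file named xerces*.
    static boolean xercesFirst(String classpath, String separator) {
        String[] entries = classpath.split(Pattern.quote(separator));
        if (entries.length == 0) {
            return false;
        }
        String first = entries[0];
        String fileName = first.substring(first.lastIndexOf('/') + 1);
        return fileName.toLowerCase().startsWith("xerces");
    }

    public static void main(String[] args) {
        // Checks the classpath of the JVM running this program.
        String classpath = System.getProperty("java.class.path");
        System.out.println(xercesFirst(classpath, File.pathSeparator)
            ? "Xerces is first on the classpath"
            : "Xerces is NOT first on the classpath");
    }
}
```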
These are the steps needed to install Cocoon on the iSeries. Once the JAR files have finished compiling, you are ready to try out Cocoon.
Unwrapping Your Content
The first document you should open is Cocoon's test document. This document tests out your Cocoon configuration and exercises all of Cocoon's JAR files. Go to a browser and type the following case-sensitive URL into the address field (assuming you are running Tomcat on the default port of 8080):
This is what the resulting page will look like in your browser. If you get an error page that refers to the XSP processor and No Such Method Found, double-check that the Xerces parser is first in your classpath. If you get an error that refers to the Java utility package zip class or an invalid Unicode character, check whether your compiles have completed; if they have not, wait and try again. After receiving one of these errors, you may need to clear your browser cache or bounce Tomcat to force the page to be processed. In some cases, the only solution to this problem is to extract Cocoon from its JAR file using the following Qshell command:
jar -xf cocoon.jar
Once you have the test document working, try out the samples by requesting page http://as400hostname:8080/cocoon/samples/index.xml. This page has a menu of samples that demonstrates the capabilities of Cocoon. If you edit the source for the sample documents, you will often find comments that explain how the sample works and how to adapt it to different environments.
One example that I ended up adapting creates a PDF file from an XML document that contains no embedded formatting. That example, which is described in "Publishing RPG IV-generated Invoices with Cocoon," demonstrates how to use XSLT to transform the document so that it includes formatting tags and then uses FOP to render a PDF document.
One concern that I have heard about XML publishing is that the tools to edit and maintain XML documents are not yet mature. Recently, tools have emerged that eliminate this concern. One of those tools is SoftQuad's XMetaL. The XMetaL editor supports several views that hide most of the complexities of editing XML documents. You can also use document type definitions (DTDs) and schemas that apply rules that ensure your XML documents are valid and match the requirements of your stylesheets.
If you are just starting out, any text editor will work to edit XML documents, but you are responsible for ensuring that your XML document is well-formed and valid. Documents that are well-formed conform to XML's syntax; documents that are valid comply with a DTD. There are a few free tools that go beyond a text editor and ensure your XML documents are well-formed. One of these is IBM's validating editor, Xeena, which you can download from IBM's alphaWorks Web site.
Another way to create content is from ILE programs. You don't have to use C, either: the iSeries supplies UNIX-compatibility APIs that make it possible to work with OS/400 IFS files directly in RPG IV. I wrote some wrappers that simplify the use of those APIs, which are available at this site.
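If you edit with a plain text editor, you can still check well-formedness yourself with a few lines of Java. The sketch below uses the standard JAXP parser in current JDKs and reports only whether a document is well-formed. Checking validity additionally requires a DTD and a validating parser, which this sketch does not attempt. The class and method names are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Returns true when the text parses as XML, false on a syntax error.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document: it is not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or read input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<customer><name>Acme</name></customer>")); // prints true
        System.out.println(isWellFormed("<customer><name>Acme</customer>"));        // prints false
    }
}
```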
Customizing Your Cocoon Installation
Because Cocoon is an open-source project, you can obtain the source, including bug fixes, from the Cocoon download page or from the Concurrent Versions System, or CVS (I will explain CVS in a moment). The Cocoon download page has links to the major releases of Cocoon; however, to obtain a bug fix or the latest version or to compare versions, you need to use CVS. CVS is a server that stores source and binary files that you can check out using a CVS client. There are CVS clients for Windows, UNIX, and Mac. To find out more about CVS, go to the official CVS Web site.
If you use VisualAge for Java, a few tricks will help you work with the source version of Cocoon. First, I would recommend importing all of the JAR files associated with Cocoon into separate projects. This will make it easier to upgrade or switch to source versions of the various JAR files. At the very least, you will need the Cocoon source and FOP, Xerces, Xalan, servlet, Turbine, and World Wide Web Consortium (W3C) JAR files. Start by importing the JAR files followed by the Cocoon source. To fix the Class Not Found errors caused by missing W3C classes, download the latest W3C sources from CVS. The module (directory) is xml-batik/sources/org/w3c.
Wrapping It Up
As you can see, Cocoon provides a powerful publishing framework that stacks up well against commercial applications. The Java background of Cocoon positions it to take advantage of new XML advances. This reliance on Java also allows Cocoon to run on the iSeries, allowing you to take advantage of the strengths of the iSeries' Java Virtual Machine (JVM). The next version of Cocoon promises to be even better at helping you manage your Web site and its content.
Installing Cocoon is not a trivial task. As Cocoon matures and is more widely deployed, I expect these steps will get simpler. In addition, keeping with the spirit of open-source projects, I hope that what you learn is passed on to other iSeries developers.
References and Related Materials: