Java and XML, 2nd Edition

10.5. Cocoon 2.0 and Beyond

Cocoon 2.0, the next generation of Cocoon, promises to be a giant leap forward for the web publishing framework. Cocoon 1.x, which is primarily based on XML being transformed via XSL, still has some serious limitations. First, it does not significantly reduce the management costs of large sites. While one XML document can be transformed into different client views, a significant number of documents will still exist. The result is generally long URIs (such as /content/publishing/books/javaxml/contents.xml), a large number of virtual path mappings (/javaxml mapped to /content/publishing/books/javaxml), or a combination of the two. In addition, a strict separation of presentation, content, and logic is difficult to accomplish, and even more difficult to manage.

Cocoon 2 focuses on enforcing the contracts between these different layers, thereby reducing management costs. XSP is a centerpiece in this design. In addition, the sitemap allows the distinction between XSP, XML, and static HTML pages to be hidden from the prying user. Advanced precompilation and memory considerations will also be introduced to make Cocoon 2 even more of an advance over Cocoon 1.x than Cocoon 1.x was over a standard web server.

10.5.1. Servlet Engine Mappings

A significant change in Cocoon 2 is that it no longer requires a simple mapping for XML documents. While this works well in the 1.x model, it still leaves management of non-XML documents to the webmaster, possibly someone completely different from the person responsible for the XML documents. Cocoon 2 seeks to take over management of the entire web site. For this reason, the main Cocoon servlet (org.apache.cocoon.servlet.CocoonServlet in the 2.0 model) is generally mapped to a URI, such as /Cocoon. This could also be mapped to the root of the web server itself (simply "/") to completely control a site. The URL requested then follows the servlet mapping: http://myHost.com/Cocoon/myPage.xml or http://myHost.com/Cocoon/myDynamicPage.xsp, for example.

With this mapping in place, even static HTML documents can be grouped with XML documents, allowing the management of all files on the server to be handled by a central person or group. If HTML, WML, and XML documents must be mixed in a directory, no confusion needs to occur, and uniform URIs can be used. Cocoon 2 will happily serve HTML as well as any other document type; with a mapping from the root of a server to Cocoon, the web publishing framework actually becomes invisible to the client.

10.5.2. The Sitemap

Another important introduction to Cocoon 2 is the sitemap. In Cocoon, a sitemap provides a central location for administration of a web site. Cocoon uses this sitemap to decide how to process the request URIs it receives. For example, when Cocoon receives a request like http://myCocoonSite.com/Cocoon/javaxml/chapterOne.html, the Cocoon servlet dissects the request and determines that the actual URI requested is /javaxml/chapterOne.html. However, suppose that the file chapterOne.html should map not to a static HTML file, but to the transformation of an XML document (as in the earlier examples). The sitemap can handle this, quite easily! Take a look at the sitemap shown in Example 10-12.

Example 10-12. Sample Cocoon 2 sitemap

  <sitemap>
   <process match="/javaxml/*.html">
    <generator type="file" src="/docs/javaxml/*.xml"/>
    <filter type="xslt">
     <parameter name="stylesheet" value="/styles/JavaXML.html.xsl"/>
    </filter>
    <serializer type="html"/>
   </process>

   <process match="/javaxml/*.pdf">
    <generator type="file" src="/docs/javaxml/*.xml"/>
    <filter type="xslt">
     <parameter name="stylesheet" value="/styles/JavaXML.pdf.xsl"/>
    </filter>
    <serializer type="fop"/>
   </process>
  </sitemap>

In this example, Cocoon matches the URI /javaxml/chapterOne.html to the sitemap directive /javaxml/*.html. It determines that this is an actual file, and that the source for the file should be determined by using the mapping /docs/javaxml/*.xml, which translates to /docs/javaxml/chapterOne.xml (the filename we want transformed). The XSLT filter is then applied; the stylesheet to use, JavaXML.html.xsl, is also specified in the sitemap. The result of the transformation is then serialized as HTML and returned to the user. In addition, the XML file could be an XSP file, which is processed to produce XML before being styled.
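The matching step described above can be sketched in plain Java. This is only an illustration of the idea, not Cocoon's implementation: the class SitemapMatchSketch and its methods stripServletPrefix and resolve are hypothetical names, and the sketch assumes a pattern with at most one * wildcard that does not span directory separators.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon: illustrates how a sitemap-style
// wildcard match could map a request URI onto a source document.
class SitemapMatchSketch {

    // Strip the servlet mapping (for example "/Cocoon") from the request path.
    static String stripServletPrefix(String requestPath, String servletPrefix) {
        return requestPath.startsWith(servletPrefix)
                ? requestPath.substring(servletPrefix.length())
                : requestPath;
    }

    // Match the URI against a pattern containing a single '*', then substitute
    // the captured text for the '*' in the source template.
    static Optional<String> resolve(String matchPattern, String srcTemplate, String uri) {
        int star = matchPattern.indexOf('*');
        if (star < 0) {
            // No wildcard: only an exact match counts.
            return matchPattern.equals(uri) ? Optional.of(srcTemplate) : Optional.empty();
        }
        // Assumption: '*' matches one path segment, so it may not contain '/'.
        String regex = Pattern.quote(matchPattern.substring(0, star))
                + "([^/]+)"
                + Pattern.quote(matchPattern.substring(star + 1));
        Matcher m = Pattern.compile(regex).matcher(uri);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(srcTemplate.replace("*", m.group(1)));
    }

    public static void main(String[] args) {
        String uri = stripServletPrefix("/Cocoon/javaxml/chapterOne.html", "/Cocoon");
        System.out.println(uri); // prints /javaxml/chapterOne.html
        System.out.println(
                resolve("/javaxml/*.html", "/docs/javaxml/*.xml", uri).orElse("no match"));
        // prints /docs/javaxml/chapterOne.xml
    }
}
```

A URI that matches no directive, such as /images/logo.gif against /javaxml/*.html, yields an empty result here; how Cocoon 2 itself treats unmatched requests is not covered by this sketch.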

This same process can render a PDF from the request http://myCocoonSite.com/Cocoon/javaxml/chapterOne.pdf, all with a few extra lines in the sitemap (shown in the previous example). The processing instructions in the individual XML documents can be completely removed, a significant change from Cocoon 1.x. As a result, stylesheets and processing can be applied uniformly based on a directory location. Simply creating XML and placing it in the /docs/javaxml/ directory in the example means the document can be accessed as HTML or PDF. It is also trivial to change the stylesheet used for all documents, something very difficult and tedious to do in Cocoon 1.x. Instead of making a change to each XML document, only a single line in the sitemap needs to be changed.

The Cocoon sitemap is still being developed, and there will probably be quite a few additional enhancements and changes to its format and structure by the time Cocoon 2.0 goes final. To get involved, join the Cocoon mailing lists. The Apache XML project at http://xml.apache.org has details about how to participate in these lists and the Cocoon project.

10.5.3. Producers and Processors

One final improvement that Cocoon 2 will include is precompiled and event-based producers and processors. In Cocoon, a producer handles the transformation of a request URI into an XML document stream. A processor then takes an input stream (currently the XML document in a DOM tree) and turns it into output readable by the client. I haven't covered producers and processors in the Cocoon 1.x model because they are going to change drastically in the Cocoon 2.0 model; any producers and processors currently being used will most likely be useless and have to be rewritten in Cocoon 2.0.

Cocoon 2 moves from using DOM for these structures to using the more event-based SAX, wrapped within a DOM structure. Because a producer in 1.x had to generate an XML document in memory, the corresponding DOM structure could get extremely large. This eventually drained system resources, particularly when performing complex tasks such as large transformations or handling formatting objects (PDF generation). For these reasons, DOM will be a simple wrapper around SAX-based events in Cocoon 2, allowing producers and processors to be very slim and efficient.
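To make the memory argument concrete, here is a minimal sketch of event-based processing using the standard JAXP SAX API. It is not Cocoon code; the class SaxStreamingSketch and its ElementCounter handler are hypothetical names chosen for illustration. The handler reacts to each element as the parser reports it, so no document tree is ever held in memory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not a Cocoon producer or processor.
class SaxStreamingSketch {

    // Counts elements as the parser reports them; no tree is built.
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            count++;
        }
    }

    static int countElements(String xml) {
        ElementCounter handler = new ElementCounter();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (Exception e) {
            // Parser configuration, I/O, and SAX errors are all wrapped here
            // to keep the sketch short.
            throw new IllegalStateException("Could not parse document", e);
        }
        return handler.count;
    }

    public static void main(String[] args) {
        String xml = "<book><chapter>One</chapter><chapter>Two</chapter></book>";
        System.out.println(countElements(xml)); // prints 3
    }
}
```

The same handler works unchanged on a document of any size, because its state is a single counter rather than a node for every element.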

In addition, producers and processors will be precompiled versions of other formats. For example, XSL stylesheets can be precompiled into processors, and XSP pages can be precompiled into producers. This further increases performance while removing load from the client. These and other changes continue to use a component model, allowing Cocoon to be a very flexible, very pluggable framework. Keep up on the latest changes by monitoring the Cocoon web site.
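The idea of compiling a stylesheet once and reusing it can be illustrated with the standard JAXP Templates interface. This is an analogy under stated assumptions, not a description of how Cocoon 2 builds its processors; the class PrecompiledStylesheetSketch, its methods, and the sample stylesheet are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration of stylesheet precompilation with JAXP.
class PrecompiledStylesheetSketch {

    // A tiny sample stylesheet that outputs the chapter title as plain text.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='/chapter/title'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Parse and compile the stylesheet once; the Templates object can be reused.
    static Templates compile(String xsl) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not compile stylesheet", e);
        }
    }

    // Apply the compiled stylesheet to one document.
    static String apply(Templates templates, String xml) {
        StringWriter out = new StringWriter();
        try {
            templates.newTransformer().transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Templates compiled = compile(XSL); // compiled once
        System.out.println(apply(compiled, "<chapter><title>Introduction</title></chapter>"));
        System.out.println(apply(compiled, "<chapter><title>SAX</title></chapter>"));
    }
}
```

Each call to apply creates a lightweight Transformer from the shared Templates object, so the cost of parsing the stylesheet is paid only once however many documents are styled.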




Copyright © 2002 O'Reilly & Associates. All rights reserved.