Modular DocBook means your content collection is broken up into smaller file modules that are recombined for publication. The advantages of modular documentation include:
The best tools for modular documentation are XIncludes and olinks. XIncludes replace the old way of doing modular files using system entities. System entities were always a problem because they cannot have a DOCTYPE declaration, and therefore cannot be valid documents on their own. This creates problems when you try to load a system entity file into a structured editor that expects to be able to validate the document. With the introduction of the XInclude feature of XML, the modular files can be valid mini documents, complete with DOCTYPE declaration. Conveniently, the module's DOCTYPE does not generate an error when its content is pulled in using the XInclude mechanism.
Olinks enable you to form cross references among your modular files. If you try to use xref or link to cross reference to another file module, then your mini document is no longer valid. That is because those elements use an IDREF-type attribute to form the link, and the ID it points to must be in the same document. They will be together when you assemble your modules into a larger document, but the individual mini documents will be incomplete. When you try to open such a module in a structured editor, it will complain that the document is not valid. Olinks get around this problem by not using IDREF attributes to form the cross reference. Olinks are resolved by the stylesheet at runtime, whether you are processing a single module or the assembled document. See Chapter 23, Olinking between documents for general information about using olinks, and the section “Modular cross referencing” for using olinks with modular files.
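As a rough illustration of the difference, the fragment below shows a cross reference to the Installing section written as an olink instead of an xref. The targetdoc value userguide is a hypothetical document identifier. It has to match whatever identifier is assigned in your olink target database, which is set up as described in the olinking chapter.

```xml
<!-- Inside a module other than intro.xml -->
<para>
  Before you continue, complete the steps in
  <olink targetdoc="userguide" targetptr="Installing"/>.
</para>
```

Because targetptr is not an IDREF-type attribute, this module stays valid on its own even though no element with id="Installing" exists in the same file.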
You can divide your content up into many individual valid file modules, and use XInclude to assemble them into larger valid documents. For example, you could put each chapter of a book into a separate chapter document file for writing and editing. Then you can assemble the chapters into a book for processing and publication.
Here is an annotated example of a chapter file, and a book file that includes the chapter file.
Chapter file intro.xml:
<?xml version="1.0"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<chapter id="intro">
  <title>Getting Started</title>
  <section id="Installing">
  ...
  </section>
</chapter>
Book file:
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<book>
  <title>User Guide</title>
  <para>This guide shows you how to use the software.</para>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
    href="intro.xml" />
  ...
</book>
When the XInclude is resolved during the processing, the <xi:include> element will be replaced by the included chapter element and all of its children. It is the author's responsibility to make sure the included content is valid in the location where it is pulled in.
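For illustration, after the XInclude is resolved the processor would see roughly the following assembled document. This is a sketch of the in-memory result, not a file you write yourself. The elided content stays elided, and some processors may also add an xml:base attribute to the included chapter element.

```xml
<book>
  <title>User Guide</title>
  <para>This guide shows you how to use the software.</para>
  <!-- The xi:include element is gone; the chapter element from
       intro.xml now stands in its place. -->
  <chapter id="intro">
    <title>Getting Started</title>
    <section id="Installing">
      ...
    </section>
  </chapter>
  ...
</book>
```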
In one of the draft XInclude standards, the namespace URI was changed to use 2003 instead of 2001 in the name, but it was changed back to 2001 for the final standard. Some XInclude processors may not have caught the change. For example, Xerces version 2.6.2 expects the XInclude namespace to use the incorrect 2003 value.
Here are some other nifty features of XInclude:
You can nest XIncludes. That means an included file can contain XIncludes to further modularize the content. This might be useful when keeping a collection of section modules that can be assembled into several different versions of a chapter. Then the chapter file is included in the larger book file.
The href value in an XInclude can be an absolute path, a relative path, an HTTP URL that accesses a web server, or any other URI. As such, it can be mapped with XML catalog entries, as described in the section “XIncludes and XML catalogs”. A relative path is taken as relative to the document that contains the XInclude element (the including document). That is true for each of any nested includes as well, even when they are in different directories.
You can select parts of an included document instead of the whole content. See the section “Selecting part of a file” for more information.
You can include parts of the including document in order to repeat part of its content in the same document, if you do it carefully. When you omit the href attribute, and add an xpointer attribute, then it is interpreted as selecting from the current document. You cannot select the whole document or that part of the document that has the XInclude element, because that would be a circular reference. You also don't want to repeat content that has any id attributes, because duplicate id values are invalid.
A document's root element can be an XInclude element. In that case, there can be only one, since a well-formed document can only have a single root element. Likewise, the included content must resolve to a single element, with its children.
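As a sketch of the same-document case described above, the following fragment repeats a paragraph elsewhere in the same file by omitting href. The id value legalnote and the element content are hypothetical, and the example assumes the document has a DOCTYPE so that id is recognized as an ID-type attribute. To avoid the duplicate id problem, it selects a child that carries no id of its own instead of the identified section.

```xml
<section id="legalnote">
  <title>Legal notice</title>
  <para>This software is provided as is.</para>
</section>
...
<!-- No href: the xpointer is resolved against the current document.
     element(legalnote/2) selects the second child element of the
     element with id="legalnote", which is the para. -->
<xi:include xpointer="element(legalnote/2)"
            xmlns:xi="http://www.w3.org/2001/XInclude"/>
```

As with any XInclude, the selected para must be valid at the point where it is repeated.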
The XInclude standard permits you to select part of a file for inclusion instead of the whole file. That is something that system entities were never able to do. In a modular source setup, that means you don't have to break out into a separate file every single piece of text that you want to include somewhere. You can organize your modules into logical units for writing and editing, and then select from within a file if you need just a piece of a module.
The simplest syntax just has an id value in an xpointer attribute. The following is an example.
<xi:include
href="intro.xml"
xpointer="Installing"
xmlns:xi="http://www.w3.org/2001/XInclude" />
If the following chapter file is named intro.xml, then this XInclude will select the section element because it has id="Installing":
<?xml version="1.0"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<chapter id="intro">
<title>Getting Started</title>
<section id="Installing">
<title>Running the installation</title>
...
</section>
</chapter>
For selections based on id, the included document must have a DOCTYPE declaration that correctly points to the DocBook DTD. It is the DTD that declares that id attributes are of the ID type (the name id is not sufficient). If the file doesn't have the DOCTYPE or if the DTD cannot be opened, then such references will not resolve.
Earlier draft versions of the XInclude standard used a URI fragment syntax to select part of a document, as in href="intro.xml#Installing". That syntax is no longer supported. Now the href must point to a file, and you must use an xpointer attribute to select part of it.
More complex selections can be made using the full XPointer syntax. Several XPointer schemes are defined, not all of which are supported by every XInclude processor. Each scheme has a fixed name followed in parentheses by an expression appropriate to that scheme. Here are several examples that are supported by the xsltproc processor.
xpointer="element(Installing)", xpointer="xpointer(id('Installing'))"
These two examples of the schemes named element() and xpointer() are equivalent to xpointer="Installing". They all select a single element with an id attribute. Be careful not to confuse the xpointer attribute with the xpointer() scheme name.
xpointer="element(/1/3/2)"
This example selects the second child of the third child of the root element of the included document. For example, an included document could consist of a book root element, which contains only chapter elements that contain only section elements. This inclusion takes the second section of the third chapter of the book. The element() scheme always selects a single element for inclusion.
xpointer="element(Installing/2)"
This example selects the second child of the element that has id="Installing" in the included document. With the element() scheme, you cannot refer to elements by element name, only by position number or id.
xpointer="xpointer(/book/chapter[3]/*)"
The xpointer() scheme uses a subset of XPath in its expressions. In this case, it selects all of the child elements of the third chapter in the book, but it does not include the chapter element itself. The xpointer() scheme can select more than one element to be included.
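Not all processors support all XPointer syntax in XIncludes. The XPointer Framework and the element() scheme are W3C Recommendations, but the xpointer() scheme was never finalized, so support for it varies. Check the documentation of your processor to see what parts of XInclude and XPointer it supports.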
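To show one of these schemes inside a complete XInclude element, here is a minimal sketch. The filename userguide.xml and the surrounding chapter are hypothetical, and the expression is a variation on the example above that selects only the section children, so that the third chapter's own title is not pulled in after the title already present here.

```xml
<chapter id="highlights">
  <title>Highlights</title>
  <!-- Selects every section child of the third chapter in the
       included book; the chapter element itself is not included. -->
  <xi:include href="userguide.xml"
              xpointer="xpointer(/book/chapter[3]/section)"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</chapter>
```

If the selected sections carry id attributes, they must not collide with ids already present in the including document.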
You can use XInclude to include plain text files as examples in your DocBook document. The XInclude element permits a parse="text" attribute that tells the XInclude processor to treat the incoming content as plain text instead of the default XML. To ensure that it is treated as text, any characters in the included content that are special to XML are converted to their respective entities:
& becomes &amp;
< becomes &lt;
> becomes &gt;
" becomes &quot;
All you need to do is point the href attribute to the filename, and add the parse="text" attribute:
<programlisting><xi:include href="codesample.c" parse="text" xmlns:xi="http://www.w3.org/2001/XInclude" /> </programlisting>
If you forget the parse="text" attribute, you will get validation errors if the included text has any of the XML special characters.
Since the included text is not XML, you can't use an xpointer attribute with XPointer syntax to select part of it. You can only select the entire file's content.
But you can specify the encoding of the incoming text by adding an encoding attribute to the XInclude element. In general a processor cannot detect what encoding is used in a text file, so be sure to indicate the encoding if it is not UTF-8. The encoding attribute is not permitted when parse="xml", because the XML prolog already indicates the encoding of an XML file.
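As a small sketch, the text include shown earlier could declare its encoding like this. The ISO-8859-1 value is only an assumption for the example; use whatever encoding the text file was actually saved in.

```xml
<!-- encoding applies only because parse="text" is set -->
<programlisting><xi:include href="codesample.c" parse="text"
    encoding="ISO-8859-1"
    xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>
```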
An XInclude can contain some fallback content. This permits processing to continue if an include cannot be resolved, maybe because the file does not exist or because of download problems. To use the fallback mechanism, instead of an empty xi:include element you put a single xi:fallback child element in it. The content of the child is used if the XInclude cannot be resolved at run time.
<xi:include href="intro.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:fallback>
    <para><emphasis>FIXME: MISSING XINCLUDE CONTENT</emphasis></para>
  </xi:fallback>
</xi:include>
The fallback content must be equally valid when inserted into the document for it to work. In fact, the xi:fallback element can contain another xi:include, which the processor will try to resolve as a secondary resource. The secondary include can also contain a secondary fallback, and so on.
Keep in mind that processing of the document does not stop when an XInclude cannot be resolved and it has a fallback child, even if that child is empty. If you want your processing to always continue regardless of how the includes resolve, then add a fallback element to all of your XInclude elements. If, on the other hand, your XIncludes must be resolved, then don't use fallback elements on the innermost includes and let the processing fail.
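The nesting described above might look like the following sketch. The backup/intro.xml path is a hypothetical secondary location, not something defined elsewhere in this chapter.

```xml
<xi:include href="intro.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:fallback>
    <!-- Tried only if intro.xml cannot be resolved -->
    <xi:include href="backup/intro.xml">
      <!-- An empty fallback: processing continues with nothing
           inserted if the secondary resource is also missing -->
      <xi:fallback/>
    </xi:include>
  </xi:fallback>
</xi:include>
```

Removing the inner xi:fallback element would make the processing fail when neither file can be found, which is the behavior you want when the content is mandatory.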
Although XIncludes are intended to replace SYSTEM entities, it is still possible to use regular entities with XInclude. You can declare regular entities for filenames in a file's DOCTYPE declaration, and then use an entity reference in the href attribute of an XInclude element. That lets you declare all the pathname information at the top of the file, where it can be more easily managed than scattered throughout the file in various includes. The example above could be reworked in the following way:
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY intro "part1/intro.xml">
<!ENTITY basics "part1/getting_started.xml">
<!ENTITY config "admin/configuring_the_server.xml">
<!ENTITY advanced "admin/advanced_user_moves.xml">
]>
<book>
  <title>User Guide</title>
  <para>This guide shows you how to use the software.</para>
  <xi:include href="&intro;" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <xi:include href="&basics;" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <xi:include href="&config;" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <xi:include href="&advanced;" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  ...
</book>
You could also declare all the entities in a central file, and then use a parameter system entity to pull the declarations into all of your documents. See the section “Shared text entities” for an example.
Since the href attribute of an XInclude element contains a URI, it can be remapped with an XML catalog. That setup would let you enter somewhat generic references in your XIncludes, and then let the catalog resolve them to specific locations on a given system. See Chapter 4, XML catalogs for more information on setting up catalogs.
For example, the following XIncludes use mythical pathnames that don't exist in the file system as they are written.
Example 22.1. XInclude and XML catalog
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<book>
  <title>User Guide</title>
  <para>This guide shows you how to use the software.</para>
  <xi:include href="file:///basics/intro.xml"
    xmlns:xi="http://www.w3.org/2001/XInclude" />
  <xi:include href="file:///basics/getting_started.xml"
    xmlns:xi="http://www.w3.org/2001/XInclude" />
  <xi:include href="file:///admin/configuring_the_server.xml"
    xmlns:xi="http://www.w3.org/2001/XInclude" />
  <xi:include href="file:///user/advanced_user_moves.xml"
    xmlns:xi="http://www.w3.org/2001/XInclude" />
  ...
</book>
This XML catalog can be used to map these mythical pathnames to real file locations on either the local system or a remote system using a URL.
<?xml version="1.0"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
  "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <rewriteURI
    uriStartString="file:///basics/"
    rewritePrefix="file:///usr/share/docsource/modules/IntroMaterial/" />
  <rewriteURI
    uriStartString="file:///admin/"
    rewritePrefix="http://myhost.mydomain.net:1482/library/administration/" />
  <rewriteURI
    uriStartString="file:///user/"
    rewritePrefix="http://myhost.mydomain.net:1482/cgi-bin/getmodule?" />
</catalog>
The resource being included could even be the output of a CGI request, as in the last catalog entry above. The href value file:///user/advanced_user_moves.xml would resolve in the catalog to http://myhost.mydomain.net:1482/cgi-bin/getmodule?advanced_user_moves.xml. Here getmodule could be a CGI script that pulls content from a database or version control system based on the query string submitted. Of course, processing a file that includes such content from the network relies on the resource being available at the time of processing.
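If you need to remap one specific file and not a whole prefix, an XML catalog also provides the uri entry. The following is a sketch of such an entry, to be placed inside the catalog element. The target path is a hypothetical location made up for this example.

```xml
<!-- Maps a single href value to a single real location -->
<uri
  name="file:///basics/intro.xml"
  uri="file:///usr/share/docsource/drafts/intro_v2.xml" />
```

How a resolver chooses between a matching uri entry and a matching rewriteURI entry is defined by the XML Catalogs specification, so check it and your resolver's documentation before mixing the two for the same path.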
Once you break your content up into modules, you may find it desirable to create a hierarchy of directories to organize the modules. You no longer have to organize the source files according to the content flow in a publication. Rather, you are free to organize the modules on your file system in any way that facilitates the management of the content, such as by chapter, subject matter, user level, author, or whatever. Your publications can use XInclude to pick and choose from among your directory hierarchy to assemble the content.
An XIncluded file can contain other XIncludes, which allows you to create a nested hierarchy of XIncludes to build a publication. They can be nested to whatever depth is necessary. The nesting of XIncludes can be completely independent of the nesting of directories. When the master document is assembled by the parser, each sequence of XIncludes is followed through the directory hierarchy to locate the content.
The href in an XInclude can be either an absolute URI or a relative URI. Absolute URIs are unambiguous, but not very portable. That is, if you move the content to another machine that has a different base location, all of the addresses will be wrong. But absolute URIs can be made portable by using an XML catalog as described in the previous section. The catalog can map the absolute URIs in the href values to a different location on the new system.
If you use relative URIs in your XInclude href values, each path is taken relative to the location of the document that contains the XInclude. So each module only has to keep track of its own XIncludes, and does not have to worry about how it might be used elsewhere in the hierarchy. This means you can process an individual module for testing from its own location, and its XIncludes will work. And it means when you process the module as referenced from another XInclude, its own XIncludes will still work. This is how a modular system should work, and it does.
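A sketch of two nested files may make the relative resolution concrete. The directory names chapters and sections and the file installing.xml are hypothetical, and DOCTYPE declarations are omitted to keep the example short.

```xml
<!-- File: book.xml -->
<book>
  <title>User Guide</title>
  <!-- Resolved relative to the directory that holds book.xml -->
  <xi:include href="chapters/intro.xml"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</book>

<!-- File: chapters/intro.xml -->
<chapter id="intro">
  <title>Getting Started</title>
  <!-- Resolved relative to chapters/, not relative to book.xml -->
  <xi:include href="../sections/installing.xml"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</chapter>
```

Processing chapters/intro.xml by itself and processing book.xml both locate the same sections/installing.xml file, because the inner href is always interpreted from the chapter file's own location.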
Relative URIs can use the ".." syntax to indicate a parent or higher directory. The following example will XInclude a file that is located two directory levels up and one level down relative to the current file's location:
<xi:include href="../../userguide/chapter2.xml"
xmlns:xi="http://www.w3.org/2001/XInclude" />
Relative paths work best when they are kept simple. A complicated path like the preceding example indicates how flexible XIncludes can be, but don't get carried away. Remember, you have to maintain these files. If you decide to rearrange your directory hierarchy, you could end up having to fix a lot of XIncludes. You might be better off using an XML catalog with absolute URIs that the catalog can resolve. Then if you rearrange your directories, you just need to rewrite your catalog file.
The previous section describes how XIncludes with relative URIs are resolved relative to the current file. The XInclude processor can do that because it fully recognizes each XInclude element from its unique namespace attribute.
But what about relative graphics file references? An XInclude-aware parser does not automatically know that the fileref attribute in an imagedata element is a path that needs to be resolved relative to the current file's location. It is the stylesheet's responsibility to do that. Fortunately, the XInclude standard helps the stylesheet do that automatically by requiring the XInclude processor to insert xml:base attributes when needed.
Here is how it works:
1. When the XInclude processor encounters an XInclude element, it replaces the XInclude element with the content pulled from the other file.
2. As it is copying the root element from the included content, it will add an xml:base attribute to that included root element if its directory differs from the location of the current file. The xml:base value indicates the location of the XIncluded file. Any XIncluded file from the same directory as the current file does not need an xml:base attribute.
3. When the XSL stylesheet (starting with version 1.66.1) processes the document with all of its XIncludes resolved, the stylesheet uses the xml:base attributes to help resolve any relative paths in a graphic element's fileref. It does that by scanning back through the graphic's ancestor elements to find an xml:base attribute. The stylesheet then prepends that to the fileref path.
4. If you have used nested XIncludes in different directories, the stylesheet will continue tracing backwards through the graphic element's ancestors, looking for xml:base attributes. The stylesheet combines them into one final path for the fileref, which ends up being the path from the master document to the graphics file.
5. If a fileref is an absolute URI, then it is used as it is, and xml:base attributes are not added to it.
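The steps above can be sketched with the resolved form of a book that included part1/intro.xml. The image path and the exact form of the xml:base value are assumptions for illustration, since processors differ in how they write the value.

```xml
<book>
  <title>User Guide</title>
  <!-- xml:base was added by the XInclude processor because the
       chapter came from a different directory than the book file -->
  <chapter id="intro" xml:base="part1/intro.xml">
    <title>Getting Started</title>
    <mediaobject>
      <imageobject>
        <!-- Written relative to part1/intro.xml by the author -->
        <imagedata fileref="images/install.png"/>
      </imageobject>
    </mediaobject>
  </chapter>
</book>
```

With this input, a stylesheet that honors xml:base would be expected to look for the graphic at part1/images/install.png relative to the book file.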
Prior to version 1.66.1 of the XSL stylesheets, xml:base attributes were not used to resolve relative fileref attributes. This meant XIncluded content with relative fileref values had to be in the same directory as the main document.
The xml:base attributes are also used to resolve relative paths in the fileref attributes of textdata elements inside textobject elements. See the section “External code files” for an example.
You might be wondering what happens to any entity references that appear in the included content. An entity reference such as &companyname; must have an entity declaration in the DTD to be resolved. If your entities are all declared in an extension to the external DocBook DTD, then your main document and the modules that use that DTD will all share the same entity declarations and there is no problem.
But what if you declare an entity in the DOCTYPE of your included file? Does the declaration go along with the included content? The answer is basically yes, with some caveats.
If your main document has a DOCTYPE declaration at the top, then any entity declarations needed for the included content are copied to that DOCTYPE from the included file.
If the DOCTYPE in the main document already has an entity declaration for that name, then the declaration in the included file must match it, or else an error will be generated. There is no overriding or substitution of entity values when using XIncludes.
If there are any entity references in the included content that are not declared in the included file, then the include will fail. In other words, you can't rely on the entity declarations in the main document to expand entity references in the included text. The text in the included document is parsed before it is included, and any entity references must resolve there.
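As a sketch of a module that satisfies these caveats, the following chapter file declares the entity it uses in its own DOCTYPE, so the reference resolves when the module is parsed for inclusion. The entity value Example Corp. is made up for this illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!-- Declared here because the main document's declarations are
     not available when this module is parsed for inclusion -->
<!ENTITY companyname "Example Corp.">
]>
<chapter id="intro">
  <title>Getting Started</title>
  <para>Welcome to the &companyname; user guide.</para>
</chapter>
```

If the main document also declares companyname, its value must match the one declared here, as described above.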
See the section “Shared text entities” for a good strategy on managing entities in a modular doc setup.
DocBook XSL: The Complete Guide - 3rd Edition | Copyright © 2002-2005 Sagehill Enterprises