XML in a Nutshell

6.4. DocBook

DocBook (http://www.docbook.org/) is an SGML application designed for new documents, not old ones. It's especially common in computer documentation. Several O'Reilly books have been written in DocBook, including Norm Walsh and Leonard Muellner's DocBook: The Definitive Guide. Much of the Linux Documentation Project (LDP, http://www.linuxdoc.org/) corpus is written in DocBook.

The current version of DocBook, 4.1.2, is available as both an SGML and an XML application. The XML version is not quite the same as the SGML version, but it's very close for most practical uses. The DocBook maintainers have announced plans to move to a single DTD that is completely compatible with both SGML and XML in version 5.0. Example 6-2 shows a simple DocBook XML document based on the book you're reading now. Needless to say, the full version of this document would be much longer.

Example 6-2. A DocBook document

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                      "docbook/docbookx.dtd">
<book>
  <title>XML in a Nutshell</title>
  <bookinfo>
    <author>
      <firstname>Elliotte Rusty</firstname>
      <surname>Harold</surname>
    </author>
    <author>
      <firstname>W. Scott</firstname>
      <surname>Means</surname>
    </author>
  </bookinfo>

  <toc>
    <tocchap><tocentry>Introducing XML</tocentry></tocchap>
    <tocchap><tocentry>XML as a Document Format</tocentry></tocchap>
    <tocchap><tocentry>XML as a "better" HTML</tocentry></tocchap>
  </toc>

  <chapter>
    <title>Introducing XML</title>
    <para></para>
  </chapter>

  <chapter>
    <title>XML as a Document Format</title>

   <para>
     XML is first and foremost a document format. It was always intended
     for web pages, books, scholarly articles, poems, short stories,
     reference manuals, tutorials, texts, legal pleadings, contracts,
     instruction sheets, and other documents that human beings would
     read. Its use as a syntax for computer data in applications like
     syndication, order processing, object serialization, database
     exchange and backup, electronic data interchange, and so forth is
     mostly a happy accident.
   </para>

   <sect1>
     <title>SGML's Legacy</title>
     <para></para>
   </sect1>
   <sect1>
     <title>TEI</title>
     <para></para>
   </sect1>

   <sect1>
     <title>DocBook</title>
     <para>
       <ulink url="http://www.docbook.org/">DocBook</ulink>
       is an SGML application designed for new documents, not old ones.
       It's especially common in computer documentation. Several
       O'Reilly books have been written in DocBook including
       <citation>Norm Walsh and Leonard Muellner's
       <citetitle>DocBook: The Definitive
       Guide</citetitle></citation>. Much of the <ulink
       url="http://www.linuxdoc.org/">Linux Documentation Project
       (LDP)</ulink> corpus is written in DocBook. </para>
   </sect1>

  </chapter>

  <chapter>
    <title>XML on the Web</title>
    <para></para>
  </chapter>

  <index>
    <indexentry>
      <primaryie>SGML, 8, 9, 91, 92, 94</primaryie>
    </indexentry>
    <indexentry>
      <primaryie>DocBook, 97-101</primaryie>
    </indexentry>
    <indexentry>
      <primaryie>TEI, 94-97, 101</primaryie>
    </indexentry>
    <indexentry>
      <primaryie>Text Encoding Initiative</primaryie>
      <seeie>TEI</seeie>
    </indexentry>
  </index>

</book>

DocBook offers many advantages to technical authors. First and foremost, it's open and nonproprietary, and it can be created with any text editor. It would feel a little silly to write open source documentation for open source software with closed and proprietary tools like Microsoft Word (which is not to say this hasn't been done). If your documents are written in DocBook, they aren't tied to any one platform, vendor, or application software. They're portable across essentially any plausible environment you can imagine.

Not only is DocBook theoretically editable with basic text editors; it's simple enough that such editing is practical as well. Of course, if you'd like a little help, there are a number of free tools available, including an Emacs major mode (http://www.nwalsh.com/emacs/docbookide/index.html). Furthermore, like many good XML applications, DocBook is modular. You can use the pieces you need and ignore the rest. If you need tables, there's a very complete tables module. If you don't need tables, you don't need to know about or use this module. Other modules cover various entity sets and equations.
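
As a rough illustration of the tables module, the following fragment sketches a small DocBook table. It assumes the CALS-style table elements used in DocBook 4.x (table, tgroup, thead, tbody, row, and entry). The table title and column headings are invented for this illustration, and the cell contents are borrowed from the index entries in Example 6-2.

```xml
<!-- Illustrative fragment only: it would sit inside a chapter or sect1
     of a DocBook document such as Example 6-2. -->
<table>
  <title>Selected index terms</title>
  <!-- cols gives the number of columns in this group of rows -->
  <tgroup cols="2">
    <thead>
      <row>
        <entry>Term</entry>
        <entry>Pages</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>DocBook</entry>
        <entry>97-101</entry>
      </row>
      <row>
        <entry>TEI</entry>
        <entry>94-97, 101</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```

A document that never uses these elements is unaffected by them, which is the practical meaning of modularity here.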

DocBook is an authoring format, not a format for finished presentation. Before a person reads a DocBook document, it is normally converted to one of several output formats, such as HTML for on-screen reading or TeX for print.

For example, if you want high-quality printed documentation for a program, you can convert a DocBook document to TeX, then use the standard TeX tools to convert the resulting TeX file to a DVI or PostScript file and print that. If you just want to read it on your computer, you'd probably convert it to HTML and load it into your web browser. For other purposes, you'd pick something else. With DocBook, all these formats come essentially for free: it's very easy to produce multiple output documents in different formats from a single DocBook source document. This benefit isn't limited to DocBook. Most well-thought-out XML input formats are just as easy to publish in other formats.
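
To make the HTML case concrete, here is a minimal sketch of an XSLT 1.0 stylesheet that turns a document shaped like Example 6-2 into a simple web page. It is not the DocBook project's own stylesheet distribution, which is far more complete. It handles only the book, chapter, sect1, title, para, ulink, and citetitle elements, and the HTML structure and class names it emits are arbitrary choices made for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- The book element becomes the HTML page. Only chapters are
       processed, so bookinfo, toc, and index are skipped. -->
  <xsl:template match="book">
    <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="chapter"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each chapter becomes a div headed by its title -->
  <xsl:template match="chapter">
    <div class="chapter">
      <h2><xsl:value-of select="title"/></h2>
      <xsl:apply-templates select="para | sect1"/>
    </div>
  </xsl:template>

  <!-- First-level sections get a smaller heading -->
  <xsl:template match="sect1">
    <div class="sect1">
      <h3><xsl:value-of select="title"/></h3>
      <xsl:apply-templates select="para"/>
    </div>
  </xsl:template>

  <!-- Paragraphs map directly to HTML paragraphs -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- A ulink becomes an HTML anchor; its url attribute supplies href -->
  <xsl:template match="ulink">
    <a href="{@url}"><xsl:apply-templates/></a>
  </xsl:template>

  <!-- Cited titles are marked up with cite; the enclosing citation
       element falls through to the built-in rules, which keep its text -->
  <xsl:template match="citetitle">
    <cite><xsl:apply-templates/></cite>
  </xsl:template>

</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to apply a stylesheet like this one. With xsltproc, for instance, the command takes the stylesheet first and the source document second, and writes the HTML to standard output. A second stylesheet aimed at a different output format could be applied to the same source file without changing it.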




Copyright © 2002 O'Reilly & Associates. All rights reserved.