OASIS OpenDocument Essentials introduces you to the XML that serves as an internal format for office applications. OpenDocument is the native format for OpenOffice.org, an open source, cross-platform office suite, and KOffice, an office suite for KDE (the K desktop environment).
You should read this book if you want to extract data from OpenDocument files, convert your data to OpenDocument format, or simply find out how the format works.
If you need to know absolutely everything about the OpenDocument format, you should download the Open Document Format for Office Applications (OpenDocument) 1.0 in PDF form from http://www.oasis-open.org/committees/download.php/12572/OpenDocument-v1.0-os.pdf or as an OpenOffice.org 1.0 format file from http://www.oasis-open.org/committees/download.php/12028/office-spec-1.0-cd-3.sxw. That document was a major source of reference for this book.
If you simply want to use OpenOffice.org or KOffice to create documents, you need only download the software from http://www.openoffice.org/ or http://www.koffice.org/ and start using it. There’s no need for you to know what’s going on behind the scenes unless you wish to satisfy your lively intellectual curiosity.
The examples in this book are written using a variety of tools and languages. I prefer to use open-source tools that work cross-platform, so most of the programming examples will be in Perl or Java. I use the Xalan XSLT processor, which you may find at http://xml.apache.org. All the examples in this book have been tested with OpenOffice.org version 1.9.100, Perl 5.8.0, and Xalan-J 2.6.0 on a Linux system using the SuSE 9.2 distribution. This is not to slight any other applications that use OpenDocument (such as KOffice) nor any other operating systems (Mac OS X or Windows); it’s just that I used the tools at hand.
This chapter tells you how a document in OpenDocument format is stored and what its major components are.
This chapter explains the XML elements that describe meta-information (information about the document), style information, and various settings associated with a document in OpenDocument format. It also describes the general structure of the file that contains a document’s content.
This chapter tells you how text documents handle character, paragraph, and section formatting. It also describes bulleted and numbered lists, and outline numbering.
This chapter covers frames, images, fields, footnotes, tracking changes, and tables in text documents.
Spreadsheets have a great deal in common with tables; this chapter points out the similarities and differences. It also covers topics such as formulas and content validation.
This chapter explains the OpenDocument elements for basic shapes such as lines, rectangles, circles, etc.; stroke and fill properties; 3-D elements and text animation.
Text and drawings are at the heart of a presentation; this chapter covers the elements used to add backgrounds, transitions, and sound.
The OpenDocument format has elements that allow you to represent charts based on data in your spreadsheets. This chapter describes the elements for chart titles, legends, axes, and tick marks.
You don’t have to create a stand-alone application to transform XML files to OpenDocument format. In this chapter, you’ll find out how to make an import filter that integrates your transformations into the OpenOffice.org application.
XML, the Extensible Markup Language, is the “native language” of OpenOffice.org. If you haven’t used XML before, you should read this appendix to familiarize yourself with this remarkably powerful and flexible format for structuring data and documents.
XSLT is an XML markup language that describes how to transform an input XML document to an output document, which may be either plain text or XML. XSLT makes it easy to have a single document serve many purposes. This appendix is a brief introduction to this powerful language.
This appendix contains utility programs that we created while writing this book. They made it easier for us to manipulate OpenDocument files, and we hope they do the same for you.
Constant Width is used for code examples and fragments.
Constant width bold is used to highlight a section of code being discussed in the text.
Constant width italic is used for replaceable elements in code examples.
Names of XML elements will be set in constant width enclosed in angle brackets, as in the <office:document> element. Attribute names and values will be in constant width, as in the fo:font-size attribute with a value of 0.5cm.
This book uses callouts to denote “points of interest” in code listings. A callout is shown as a white number in a black circle; the corresponding number after the listing gives an explanation. Here’s an example:
Roses are red,
Violets are blue.
Some poems rhyme;
This one doesn’t.
Please address comments and questions concerning this book to the publisher:
O’Reilly & Associates, Inc.
101 Morris Street
Sebastopol, CA 95472
1-800-998-9938 (in the United States or Canada)
1-707-829-0515 (international/local)
1-707-829-0104 (fax)
The author has a web page for this book, where he lists errata, examples, and any additional information. You can access this page at:
http://books.evc-cit.info/odbook/
For more information about O’Reilly & Associates books, conferences, software, Resource Centers, and the O’Reilly Network, see the web site at:
http://www.oreilly.com
Thanks to Simon St. Laurent, the original editor of this book, who thought it would be a good idea and encouraged me to write it. Thanks also to Erwin Tenhumberg, who suggested that I update the book from the original OpenOffice.org version to the current description of OpenDocument. Thanks also to Adam Moore, who converted the original HTML files to OpenOffice.org format, and to Jean Hollis Weber, who assisted with final layout and proofreading. Edd Dumbill wrote the document which I modified slightly to create Appendix A. Of course, any errors in that appendix have been added by my modifications. Michael Chase provided a platform-independent version of the pack and unpack programs described in the section called “Unpacking and Packing OpenDocument files”.
Since this is a work in progress, I also want to thank all the people who are taking the time to read and review it and send their comments. Special thanks to Valden Longhurst, who found a multitude of typographical and grammatical oddities.
Copyright (c) 2005 O’Reilly & Associates, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".