Olinking in print output

You can use olink to form active hotlinks between PDF documents. In stylesheet versions prior to 1.66, the olink text was generated, but the links were active only if they pointed to a destination within the same document. Now an olink can open another PDF file, and, under the right circumstances, can scroll to the exact location in the external document.

Scrolling to the exact location means that the fragment identifier that the stylesheet adds to a PDF reference is resolved:

myotherbook.pdf#intro

This should open the PDF file named myotherbook.pdf and scroll to the point in the document where the intro id is located.

The use of fragment identifiers is currently somewhat limited because of the inconsistent behavior of XSL-FO processors and PDF browsers. Only the XEP and Antenna House XSL-FO processors add the correct ID information to the PDF document for it to work. Neither FOP nor Xml2PDF writes the original ID values from the XSLT process to the PDF file. So although a link with a fragment identifier is properly formed, the ID value it points to does not exist in the destination PDF.

If you are using XEP, you may find that some external olinks work and others do not, even though they should. This could be because some of the id attributes from the source document were not copied to the PDF file. By default, XEP only writes id values that are internally referenced. There is a special XEP processing instruction that enables all id values, but it first appeared in version 1.68 of the DocBook stylesheets. If you are using a stylesheet version prior to that, you need to add this option to your XEP command:

-DDROP_UNUSED_DESTINATIONS=false

See your XEP documentation for more information.

Also, some browsers don't properly interpret PDF fragment identifiers. Adobe Reader seems to work, but Adobe Reader used as a PDF plugin inside Internet Explorer does not. Another factor is whether the PDF file is accessed as a local file or from an HTTP server. In some cases, the hotlink will open the other PDF file but won't scroll to the exact location. In other cases, the browser will complain that it cannot open the other document at all because it is misinterpreting the fragment identifier as part of the filename.

Because the fragment identifiers in PDF olinks can cause such problems, the FO stylesheet turns them off by default. Without the fragment identifiers, an olink to another PDF document will open the document and display its first page. You can turn fragment identifiers on if your situation permits by setting the stylesheet parameter insert.olink.pdf.frag to 1.
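As a minimal sketch, the parameter could be set in an FO customization layer like the following. The import location is an assumption; point it at wherever the stock FO stylesheet lives in your installation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock FO stylesheet; adjust as needed -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Append #id fragment identifiers to olinks that point to other PDF files -->
  <xsl:param name="insert.olink.pdf.frag" select="1"/>

</xsl:stylesheet>
```

The same parameter can instead be passed on the command line of your XSLT processor if you prefer not to maintain a customization layer.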

Setting up PDF olinking

Since the FO stylesheet now supports olinks, it can also be used to generate a document's olink data file (whose default name is target.db). In previous versions you had to use the HTML stylesheet to generate the data file. Now you can set the collect.xref.targets parameter when using the FO stylesheet.

You may need to maintain separate olink data files for HTML and FO processing. If you don't customize the text that is generated for cross references, then you can use one olink data file. Or if you customize the generated text but do it the same for both HTML and FO output, then you can still use one olink data file. But if your generated text differs between HTML and FO output, then you will need to maintain separate target data files. That's because those files store copies of the generated text for each target. Use the targets.filename parameter to change the default data filename from target.db to something else.
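For example, a customization layer along these lines would have the FO stylesheet write its own target data file. This is a sketch: the import location and the filename target-fo.db are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock FO stylesheet; adjust as needed -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Collect olink targets while processing the document.
       Use "only" to write the data file without producing FO output. -->
  <xsl:param name="collect.xref.targets">yes</xsl:param>

  <!-- Hypothetical filename, chosen so the FO data does not
       overwrite an HTML-generated target.db -->
  <xsl:param name="targets.filename">target-fo.db</xsl:param>

</xsl:stylesheet>
```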

Regardless of whether you need to create separate target data files for each document, you will need to create separate master olink database files (identified by the target.database.document parameter) for HTML and FO output. That's because the baseuri attributes on the document element in the database must differ.

For HTML output, the baseuri would be either the HTML filename (for nonchunked output), or a directory name (for chunked output). For FO output, the baseuri must point to the PDF file. If you are using XEP, local file access requires using a file: protocol, so it should be file:myfile.pdf. For other XSL-FO processors the baseuri should be just myfile.pdf. If you are olinking to a PDF document on a web server, then the baseuri should be the URI of the PDF file, such as http://mysite.com/myfile.pdf.
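A master olink database for FO output might then look something like this sketch. The document names, directory name, and entity paths are hypothetical, and both PDF files are assumed to sit in the same directory.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE targetset [
<!-- Hypothetical paths to target data files generated with the FO stylesheet -->
<!ENTITY ugtargets SYSTEM "userguide/target-fo.db">
<!ENTITY rgtargets SYSTEM "reference/target-fo.db">
]>
<targetset>
  <targetsetinfo>
    Olink database for PDF output (illustrative only).
  </targetsetinfo>
  <sitemap>
    <dir name="pdf">
      <!-- XEP with local file access: use the file: protocol.
           For other XSL-FO processors, use just userguide.pdf -->
      <document targetdoc="UserGuide" baseuri="file:userguide.pdf">
        &ugtargets;
      </document>
      <document targetdoc="Reference" baseuri="file:reference.pdf">
        &rgtargets;
      </document>
    </dir>
  </sitemap>
</targetset>
```

A corresponding database for HTML output would have the same structure but with HTML filenames or directory names in the baseuri attributes.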

If you are olinking among a collection of PDF documents located in different directories, you can use the sitemap feature of the olink database to indicate their relative locations. Then the processor can compute a relative path between documents to form the link. See the section “Using a sitemap” for more information.

Linking between HTML and PDF documents

It is possible to form olinks between HTML and PDF documents. In the olink database file (identified by the target.database.document parameter), an HTML document can have an HTML baseuri, and another document in PDF form can have a PDF baseuri. But you can't do both formats for one document in the same database because each targetdoc attribute must be unique within each language. So when you set up the olink database file, you have to decide if you are targeting the HTML or PDF version of a given document.
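As an illustration, a database that targets the HTML version of one document and the PDF version of another could be sketched as follows. All names and paths here are hypothetical, and both outputs are assumed to be in the same directory.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE targetset [
<!-- Hypothetical target data files: one from HTML processing, one from FO -->
<!ENTITY ugtargets SYSTEM "userguide/target.db">
<!ENTITY rgtargets SYSTEM "reference/target-fo.db">
]>
<targetset>
  <sitemap>
    <dir name="output">
      <!-- Olinks to UserGuide resolve to its HTML version -->
      <document targetdoc="UserGuide" baseuri="userguide.html">
        &ugtargets;
      </document>
      <!-- Olinks to Reference resolve to its PDF version -->
      <document targetdoc="Reference" baseuri="reference.pdf">
        &rgtargets;
      </document>
    </dir>
  </sitemap>
</targetset>
```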

Whether your links work at runtime will depend on your browser setup. An HTML browser needs to detect that the link is a PDF file, and load it into a PDF viewer. Going in the other direction, a PDF viewer needs to detect that the link is an HTML file, and pass the URI off to an HTML browser.

Page references in olinks

There are two situations for generating page numbers for olinks: internal and external olinks. Internal olinks are used within a document when its text is divided among multiple separate file modules. As described in the section “Modular cross referencing”, olinks are used to form cross references between modules so that each module can be validated. The stylesheet recognizes when an olink is to a location within the same document when the current.docid parameter is set and its value matches the targetdoc attribute in an olink. Such olinks will be treated as if they were internal xref links, and they will get a page reference if that feature is turned on for internal links.
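A customization layer for this internal case might be sketched as follows. The document identifier UserGuide and the import location are assumptions, and insert.xref.page.number is shown on the reading that it is the parameter that turns on page references for internal links.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock FO stylesheet; adjust as needed -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Hypothetical identifier of the document being processed.
       An olink such as
         <olink targetdoc="UserGuide" targetptr="intro"/>
       matches this value, so it is treated as an internal link. -->
  <xsl:param name="current.docid">UserGuide</xsl:param>

  <!-- Turn on page references for internal cross references -->
  <xsl:param name="insert.xref.page.number">yes</xsl:param>

</xsl:stylesheet>
```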

For olinks to external documents, the situation is different. The olink mechanism includes partial support for page number references to external documents. That means you could refer someone to an actual page number in the other document. In the olink data set for a document, the page number for each target can be stored in a page attribute on each div or obj element. If an olink style calls for a page number, and if the data is there, it will be output.

But there is currently no standard way of populating the page attribute in the olink data. Since page numbers are only available in the XSL-FO processor and not the stylesheet, some sort of postprocessing would be needed to extract the page numbers for each element. The collecting of page numbers is not part of the stylesheets at this time.
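For illustration only, a fragment of olink target data with page attributes filled in might look like the following. The ids, titles, and page values are invented; they would have to be entered by hand or by a postprocessing step of your own.

```xml
<!-- Hypothetical excerpt from a target data file; page values are made up -->
<div element="chapter" href="#intro" number="1" targetptr="intro" page="5">
  <ttl>Introduction</ttl>
  <xreftext>Chapter 1, Introduction</xreftext>
  <obj element="figure" href="#fig.overview" number="1.1"
       targetptr="fig.overview" page="7">
    <ttl>Overview</ttl>
    <xreftext>Figure 1.1, “Overview”</xreftext>
  </obj>
</div>
```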