The final pieces of XML we cover are XPointer and XLink. These are separate standards in the XML family dedicated to working with XML links. Before we delve into them, however, we should warn you that the standards described here are not final as of publication time.
It's important to remember that an XML link is only an assertion of a relationship between pieces of documents; how the link is actually presented to a user depends on a number of factors, including the application processing the XML document.
To create a link, we must first have a labeling scheme for XML elements. One way to do this is to assign an identifier to specific elements we want to reference using an ID attribute:
<paragraph id="attack"> Suddenly the skies were filled with aircraft. </paragraph>
You can think of IDs in XML documents as street addresses: they provide a unique identifier for an element within a document. However, just as there might be an identical address in a different city, an element in a different document might have the same ID. Consequently, you can tie together an ID with the document's URI, as shown here:
http://www.oreilly.com/documents/story.xml#attack
The combination of a document's URI and an element's ID should uniquely identify that element throughout the universe. Remember that an ID attribute does not need to be named id, as shown in the first example. You can name it anything you want, as long as you define it as an XML ID in the document's DTD. (However, using id is preferred in the event that the XML processor does not read the DTD.)
Should you give an ID to every element in your documents? No. Odds are that most elements will never be referenced. It's best to place IDs on items that a reader would want to refer to later, such as chapter and section divisions, as well as important items, such as term definitions.
The easiest way to refer to an ID attribute is with an ID reference, or IDREF. Consider this example:
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE document [
  <!ELEMENT document (employee*)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee emplabel ID #REQUIRED>
  <!ATTLIST employee boss IDREF #IMPLIED>
]>
<document>
  <employee emplabel="emp123">Jay</employee>
  <employee emplabel="emp124">Kay</employee>
  <employee emplabel="emp125" boss="emp123">Frank</employee>
  <employee emplabel="emp126" boss="emp124">Hank</employee>
</document>
As with ID attributes, an IDREF is typically declared in the DTD. However, if you're in an environment where the processor might not read the DTD, you should call your ID references IDREF.
The chief benefit of using an IDREF is that a validating parser can ensure that every one points to an actual element; unlike other forms of linking, an IDREF is guaranteed to refer to something within the current document.
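A validating parser performs this check from the DTD. As a rough sketch of the same idea without a DTD, the Python fragment below (standard library only) simply assumes that emplabel is the ID attribute and boss is the IDREF attribute, and reports references that point at nothing. The function name dangling_idrefs and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

def dangling_idrefs(xml_text, id_attr="emplabel", idref_attr="boss"):
    """Return IDREF values that match no ID in the same document.

    The attribute names are assumptions passed in by the caller; a real
    validating parser would learn them from the DTD instead.
    """
    root = ET.fromstring(xml_text)
    ids = {el.get(id_attr) for el in root.iter() if el.get(id_attr) is not None}
    refs = [el.get(idref_attr) for el in root.iter() if el.get(idref_attr) is not None]
    return [ref for ref in refs if ref not in ids]

DOC = """<document>
  <employee emplabel="emp123">Jay</employee>
  <employee emplabel="emp125" boss="emp123">Frank</employee>
  <employee emplabel="emp126" boss="emp999">Hank</employee>
</document>"""

print(dangling_idrefs(DOC))  # the reference to emp999 has no target
```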
As we mentioned earlier, the IDREF only asserts a relationship of some sort; the style sheet and the browser will determine what is to be done with it. If the referring element has some content, it might become a link to the target. But if the referring element is empty, the style sheet might instruct the browser to perform some other action.
As for the linking behavior, remember that in HTML a link can point to an entire document (which the browser will download and display, positioned at the top) or to a specific location in a document (which the browser will display, usually positioned with that point at the top of the screen). However, linking changes drastically in XML. What does it mean to have a link to an entire element, which might be a paragraph (or smaller) or an entire group of chapters? The XML application attempts some kind of guess, but the display is best controlled by the style sheet. For now, it's best to simply make a link as meaningful as you can.
XPointer is designed to resolve the problem of locating an element or range of elements in an XML document. It is possible to do this in HTML if the target is marked with an <a name="name"> tag. A link to that part of the document is then made using the <a href="url#name"> tag.
As we saw earlier, XML has this type of functionality through its unique identifiers. It is possible to locate an element with an identifier using a link such as the following:
document.xml#identifier
where identifier is a valid XPointer fragment identifier. However, this form is a simplification that is tolerated for compatibility with previous versions. The most common syntax for an XPointer fragment identifier is:
document.xml#xpointer(xpath)
Here xpath is an expression consistent with the XPath specification. XPath is a natural fit in this case because its expressions locate a node-set within a document. The link document.xml#identifier can be rewritten as:
document.xml#xpointer(id("identifier"))
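There is a third possible form made up of whole numbers separated by slashes. Each whole number selects the nth child of its predecessor in the expression. For example, document.xml#/1/2/3 selects the third child of the second child of the document's root element.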
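To make the numeric form concrete, here is a minimal sketch in Python of how an application might walk such a sequence. It assumes that the first number must be 1 (the root element itself) and that only child elements are counted; the function name resolve_child_sequence and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def resolve_child_sequence(root, sequence):
    """Resolve a sequence such as '/1/2/3' starting from the root element.

    The first step must be 1 (the root element); each later step n picks
    the nth child element (1-based) of the previous result. Returns None
    when a step does not exist.
    """
    steps = [int(s) for s in sequence.split("/") if s]
    if not steps or steps[0] != 1:
        return None
    current = root
    for n in steps[1:]:
        children = list(current)  # child elements, in document order
        if n < 1 or n > len(children):
            return None
        current = children[n - 1]
    return current

STORY = ET.fromstring(
    "<story>"
    "<chapter><para>one</para><para>two</para><para>three</para></chapter>"
    "<chapter><para>four</para></chapter>"
    "</story>"
)

print(resolve_child_sequence(STORY, "/1/1/3").text)
```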
Several fragment identifiers can be combined by placing them one after the other. For example:
document.xml#xpointer(...)xpointer(...)...
The application must evaluate the fragments, from left to right, and use the first valid fragment. This functionality is useful for two reasons:
It offers several solutions, the first of which is based on suppositions that may prove to be false (and produce an error). For example, we can try to locate a fragment in a document using an identifier, then (if no ID was defined) using the attribute value with the name id. We would write the fragment:
xpointer(id("conclusion"))xpointer(//*[@id="conclusion"])
It also allows for future specifications. If an XPointer application encounters an expression that does not begin with xpointer, it will simply ignore it and move on to the next expression.
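The left-to-right fallback can be sketched in Python as follows. This is only an illustration under heavy assumptions: it understands just two expression shapes, id("...") and //*[@name="..."], it is not an XPath engine, and the caller must list which attributes the DTD declares as IDs because ElementTree does not read that information. The names split_parts, find_by_attr, and resolve are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

def split_parts(fragment):
    """Split 'scheme(...)scheme(...)' into (scheme, body) pairs.

    Nested parentheses are balanced; parentheses inside quoted strings
    are not treated specially in this sketch.
    """
    parts, i = [], 0
    fragment = fragment.strip()
    while i < len(fragment):
        open_at = fragment.index("(", i)
        depth, j = 1, open_at + 1
        while j < len(fragment) and depth:
            depth += {"(": 1, ")": -1}.get(fragment[j], 0)
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses in fragment")
        parts.append((fragment[i:open_at].strip(), fragment[open_at + 1:j - 1]))
        i = j
    return parts

def find_by_attr(root, attr_names, value):
    """Return the first element carrying one of attr_names with the given value."""
    for el in root.iter():
        if any(el.get(name) == value for name in attr_names):
            return el
    return None

def resolve(root, fragment, declared_id_attrs=()):
    """Try each part from left to right and return the first element found."""
    for scheme, body in split_parts(fragment):
        if scheme != "xpointer":
            continue  # unknown scheme: ignore it and move on
        body = body.strip()
        by_id = re.fullmatch(r'id\("([^"]*)"\)', body)
        by_attr = re.fullmatch(r'//\*\[@(\w+)="([^"]*)"\]', body)
        hit = None
        if by_id:
            hit = find_by_attr(root, declared_id_attrs, by_id.group(1))
        elif by_attr:
            hit = find_by_attr(root, (by_attr.group(1),), by_attr.group(2))
        if hit is not None:
            return hit
    return None

DOC = '<doc><section id="conclusion">The end.</section></doc>'
FRAGMENT = 'xpointer(id("conclusion"))xpointer(//*[@id="conclusion"])'

# No attribute is declared as an ID here, so the first part finds nothing
# and the second part supplies the result.
print(resolve(ET.fromstring(DOC), FRAGMENT).text)
```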
As we mentioned earlier, the XPointer application is responsible for link rendering, but it is also responsible for error handling. If the link's URL is wrong or if the fragment identifier is not valid, it is up to the application to manage the situation (by displaying an error message, for example).
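Earlier we showed you how to locate an XML node within a document. XPointer goes even further by defining the point, range, and position (location) types. A point is a position within the document, a range is everything between a starting point and an ending point, and a location is a node, a point, or a range.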
Equipped with these new datatypes, XPointer can set out to locate a resource in an XML document.
A range is defined using the to operator. The operator takes a starting point on its left and an ending point on its right. The second point is calculated using the first point as a reference. For example, to make a range from the beginning of the first paragraph to the end of the last paragraph in a section where the ID is XPointer, you would write:
xpointer(id("XPointer")/para[1] to id("XPointer")/para[last( )])
or:
xpointer(id("XPointer")/para[1] to following-sibling::para[last( )])
A range defined this way may be compared with the selection a user can make in a document with a mouse.
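Naturally, XPointer also has functions to manipulate points and ranges. One of them, string-range, searches a set of locations for a string and returns a range for each occurrence. For example, the following expression finds the string XML in the <chapter> elements whose title attribute has the value XPointer:
string-range(//chapter[@title="XPointer"], "XML")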
To index the word XML by pointing to the first occurrence of the word in an element such as <para>, use the following expression:
string-range(//para, "XML")[1]
This function takes two additional optional arguments. The third argument, offset, is a number that gives the first character to include in the result range, counted from the beginning of the string searched for. The fourth argument, length, gives the length of the result range. By default, offset has a value of 1, so the result range begins before the first character in the string. length has a default value such that the result range covers the entire string searched.
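The arithmetic behind offset and length is easy to get wrong by one, so here is a small Python sketch that applies the same rules to a plain string rather than to an XML document. The function name string_ranges is hypothetical, the results are 0-based, end-exclusive character indexes (Python's convention, not XPointer's), and the default length is assumed to run to the end of the matched string.

```python
def string_ranges(text, needle, offset=1, length=None):
    """Return (start, end) index pairs, one per occurrence of needle.

    offset is 1-based, as in the description above: offset=1 starts the
    range just before the first character of the match.
    """
    if length is None:
        # Assumed default: extend to the end of the matched string.
        length = len(needle) - (offset - 1)
    ranges = []
    pos = text.find(needle)
    while pos != -1:
        start = pos + offset - 1
        ranges.append((start, start + length))
        pos = text.find(needle, pos + 1)
    return ranges

TEXT = "XML links: XML and more XML"

print(string_ranges(TEXT, "XML"))        # one range per occurrence
print(string_ranges(TEXT, "XML", 2, 1))  # only the second letter of each match
```

The first element of the returned list plays the part of the [1] predicate in the expression shown earlier.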
Now that we know about XPointer, let's take a look at some inline links:
<?xml version="1.0"?>
<simpledoc xmlns:xlink="http://www.w3.org/1999/xlink">
  <title>An XLink Demonstration</title>
  <section id="target-section">
    <para>This is a paragraph in the first section.</para>
    <para>More information about XLink can be found at
      <reference xlink:type="simple" xlink:href="http://www.w3.org"> the W3C </reference>.
    </para>
  </section>
  <section id="origin-section">
    <para> This is a paragraph in the second section. </para>
    <para> You should go read
      <reference xlink:type="simple" xlink:href="#target-section"> the first section </reference>
      first.
    </para>
  </section>
</simpledoc>
The first link states that the text the W3C is linked to the URL http://www.w3.org. How does the browser know? Simple. An HTML browser knows that every <a> element is a link because the browser has to handle only one document type. In XML, you can make up your own element type names, so the browser needs some way of identifying links.
XLink provides the xlink:type attribute for link identification. A browser knows it has found a simple link when any element sets the xlink:type attribute to a value of simple. A simple link is like a link in HTML: one-way and beginning at the point in the document where it occurs. (In fact, HTML links can be recast as XLinks with minimal effort.) In other words, the content of the link element can be selected for traversal at the other end. Returning to the source document is left to the browser.
Once an XLink processor has found a simple link, it looks for other attributes that it knows:
xlink:href: This attribute must be specified, since without it, the link is meaningless. It is an error not to include it.
xlink:show: You do not need to give a value for this attribute. Remember that a link primarily asserts a relationship between data; behavior is best left to a style sheet. So unless the behavior is paramount (as it might be in some cases of embed), it is best not to use this attribute.
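As a minimal sketch of how an application might recognize simple links, the Python fragment below scans a document for elements whose xlink:type is simple, whatever the element is called, and collects each link's target and text. It uses only the standard library; the function name simple_links and the sample document are illustrative assumptions, not part of XLink.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

def simple_links(root):
    """Return (tag, href, text) for every element marked as a simple link."""
    found = []
    for el in root.iter():
        # ElementTree spells namespaced attributes as {namespace-uri}local-name.
        if el.get("{%s}type" % XLINK) == "simple":
            href = el.get("{%s}href" % XLINK)
            text = " ".join("".join(el.itertext()).split())
            found.append((el.tag, href, text))
    return found

DOC = """<simpledoc xmlns:xlink="http://www.w3.org/1999/xlink">
  <para>See <reference xlink:type="simple"
      xlink:href="http://www.w3.org"> the W3C </reference>.</para>
  <para>Read <reference xlink:type="simple"
      xlink:href="#target-section"> the first section </reference> first.</para>
  <para>No link here.</para>
</simpledoc>"""

for tag, href, text in simple_links(ET.fromstring(DOC)):
    print(tag, href, text)
```

Because the test is made on the attribute rather than on the element name, the same code works for any vocabulary that adopts the XLink attributes.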
XLink has much more to offer, including links to multiple documents and links between disparate documents (where the documents being linked do not themselves contain any links).
An XLink application recognizes extended links by the presence of an xlink:type="extended" attribute that distinguishes it from a simple link (such as those used in HTML). An extended link may have semantic attributes (xlink:role and xlink:title) that function just as they do for a simple link.
In addition, each element contained in an extended link may be one of four types, as defined by its xlink:type="type" attribute: resource, locator, arc, or title.
Consider this example of an extended link supplying an XML bibliography:
<biblio xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <text xlink:type="resource" xlink:role="text">XML Bibliography</text>
  <book xlink:type="locator" xlink:role="book" xlink:href="xmlgf.xml" xlink:title="XML Pocket Reference"/>
  <book xlink:type="locator" xlink:role="book" xlink:href="lxml.xml" xlink:title="Learning XML"/>
  <author xlink:type="locator" xlink:role="author" xlink:href="robert-eckstein.xml" xlink:title="Robert Eckstein"/>
  <author xlink:type="locator" xlink:role="author" xlink:href="erik-ray.xml" xlink:title="Erik Ray"/>
  <arc xlink:type="arc"/>
</biblio>
The extended link will probably be represented graphically as a menu with an entry for each element, except for the last one (arc), which has no graphical representation. However, the graphical representation of the link is the application's responsibility. Let's look at the role of each of the elements.
Resource elements, which include the xlink:type="resource" attribute, define a local resource that participates in a link. An extended link that includes a resource is considered inline because the file in which it is found participates in a link. A link that has no resource is called out-of-line.
XLink applications use the following attributes:
Attribute | Description |
---|---|
xlink:type | resource (fixed value) |
xlink:role | Role of this resource in the link (used by arcs) |
xlink:title | Text used by the XLink application to represent this resource |
In our example, the <text> element supplies the text to be displayed to represent the link.
Locator elements have the xlink:type="locator" attribute and use a URI to point to a remote resource. XLink applications use the following locator attributes:
Attribute | Description |
---|---|
xlink:type | locator (fixed value) |
xlink:href | URI of the resource pointed to |
xlink:role | Role of the resource pointed to (used by arcs) |
xlink:title | Text the XLink application uses to graphically represent the resource |
In our example, we use two kinds of locators: those with a role of book that point to documents describing publications, and those with a role of author that point to a biography. Here, the role is important because it tells the XLink application the potential traversals among resources.
Arc elements have the xlink:type="arc" attribute and determine the potential traversals among resources, as well as the behavior of the XLink application during such traversals. Arc elements may be represented as arrows in a diagram, linking resources that participate in an extended link.
XLink applications use the following arc attributes:
Attribute | Description |
---|---|
xlink:type | arc (fixed value) |
xlink:from | Role of the resource at which the arc originates |
xlink:to | Role of the resource at which the arc ends |
xlink:show | new, replace, embed, other, or none: tells the XLink application how to display the resource to which the arc is pointing |
xlink:actuate | onLoad, onRequest, other, or none: tells the XLink application the circumstances under which the traversal is made |
xlink:arcrole | Role of the arc |
xlink:title | Text that may be used to represent the arc |
The values of the xlink:show and xlink:actuate attributes have the same meaning as they do with simple links.
Let's go back to our example of the bibliography, where we could define the following arc:
<arc xlink:type="arc" xlink:from="text" xlink:to="book" xlink:show="new" xlink:actuate="onRequest"/>
The arc creates a link from the text displayed by the browser (a resource where the role is text) to the page describing the book (a remote resource where the role is book). The page must be displayed in a new window (xlink:show="new") when the user clicks the mouse button (xlink:actuate="onRequest").
To include the author's biography in the page describing the book, we will define the following arc:
<arc xlink:type="arc" xlink:from="book" xlink:to="author" xlink:show="embed" xlink:actuate="onLoad"/>
xlink:show="embed" indicates that the destination of the arc (the biography) must be included in the page describing the book (the origin of the arc), and xlink:actuate="onLoad" indicates that it must be included when the book page is loaded.
Finally, note that when the xlink:from or xlink:to attribute is absent, the origin or destination of the arc corresponds to all the roles defined in the link. Thus, the arc in our example (<arc xlink:type="arc"/>) authorizes all the traversals possible among the resources of the extended link.
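The rule for missing attributes can be sketched in Python as follows: each arc contributes every (from, to) pair of roles it allows, and a missing xlink:from or xlink:to stands for all the roles in the link. This is an illustrative reading of the rule, not an XLink processor; the function names xl and traversals and the shortened sample link are assumptions.

```python
import xml.etree.ElementTree as ET
from itertools import product

XLINK = "http://www.w3.org/1999/xlink"

def xl(name):
    """Build the namespace-qualified attribute name ElementTree expects."""
    return "{%s}%s" % (XLINK, name)

def traversals(link):
    """Return the sorted (from_role, to_role) pairs allowed by the link's arcs."""
    roles = sorted({
        child.get(xl("role"))
        for child in link
        if child.get(xl("type")) in ("resource", "locator") and child.get(xl("role"))
    })
    pairs = set()
    for arc in link:
        if arc.get(xl("type")) != "arc":
            continue
        # A missing from/to attribute stands for every role in the link.
        sources = [arc.get(xl("from"))] if arc.get(xl("from")) else roles
        targets = [arc.get(xl("to"))] if arc.get(xl("to")) else roles
        pairs.update(product(sources, targets))
    return sorted(pairs)

BIBLIO = """<biblio xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <text xlink:type="resource" xlink:role="text">XML Bibliography</text>
  <book xlink:type="locator" xlink:role="book" xlink:href="xmlgf.xml"/>
  <author xlink:type="locator" xlink:role="author" xlink:href="erik-ray.xml"/>
  <arc xlink:type="arc" xlink:from="text" xlink:to="book"/>
  <arc xlink:type="arc" xlink:from="book"/>
</biblio>"""

for pair in traversals(ET.fromstring(BIBLIO)):
    print(pair)
```

In this sample the second arc has no xlink:to, so it allows traversal from book to every role, including book itself.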
Elements with a type of title tell the XLink application the title of the extended link. This element is needed when you want titles to contain markup (for example, to put the text in bold) or if you want to provide titles in multiple languages. A <title> element must have the xlink:type="title" attribute.
As there may be a large number of attributes for the elements participating in an extended link, we recommend declaring default values in the DTD. This eliminates the need to include fixed-value attributes for an element.
For example, because the xlink:type attribute of the <biblio> element always has extended as the value, we could declare the <biblio> element in the DTD as follows:
<!ELEMENT biblio (text, book+, author+, arc+)>
<!ATTLIST biblio xlink:type (extended) #FIXED "extended">
We would not need to indicate the type, and if we proceed the same way for the other elements in the extended link, we could write the following link:
<biblio>
  <text>XML Bibliography</text>
  <book xlink:href="xmlgf.xml" xlink:title="XML Pocket Reference"/>
  <book xlink:href="lxml.xml" xlink:title="Learning XML"/>
  <author xlink:href="robert-eckstein.xml" xlink:title="Robert Eckstein"/>
  <author xlink:href="erik-ray.xml" xlink:title="Erik Ray"/>
  <arc/>
</biblio>
By limiting ourselves to the strict minimum (attributes where the value is fixed do not need to be written), we gain readability.
As indicated earlier, an extended link with no resource-type element (local resource) is described as being out-of-line. Therefore, this type of link is not defined in any of the files to which it points. It may be convenient to group extended links in XML files called linkbases.
This raises the question of where such XML files are located. Because the W3C specification provides no way of finding the linkbases associated with a given file, we must indicate the URI in one of the files participating in the link. This is possible with the xlink:role attribute with the value xlink:extended-linkset.
The XLink application recognizes the attribute and can look for the associated linkbase, whose URI is indicated by the xlink:href attribute. For example, to link the linkbase at the URI linkbase.xml to an XML file, we could use an element with the following syntax:
<linkbase>
  <uri xlink:role="xlink:extended-linkset" xlink:href="linkbase.xml"/>
</linkbase>
We can indicate as many linkbases in a file as we want. A linkbase can itself contain a reference to another linkbase. It is up to the XLink application to manage circular references and limit the depth of the search for linkbases.
XBase is a W3C specification currently in development. XBase can be used to change the base of URIs in an XML document (which, by default, is the document's directory). XLink processors take XBase into consideration in order to manage URIs, using the xml:base="URI" attribute as follows:
<linkbase xml:base="http://www.oreilly.com/bdl/">
  <uri xlink:role="xlink:extended-linkset" xlink:href="linkbase.xml"/>
</linkbase>
The linkbase.xml linkbase is searched for in the http://www.oreilly.com/bdl/ directory, not in the directory of the document where the request was made to load the linkbase.
The base applies to all the nodes that descend from the node in which it is defined (this is the same behavior as the xml:lang and xml:space attributes).
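A minimal Python sketch of this inheritance, using only the standard library: the function walks the tree, lets each xml:base attribute replace the base for its own element and descendants, and resolves every xlink:href against the base in effect. The function name resolved_hrefs, the sample document, and the document URI http://example.com/docs/story.xml are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

def resolved_hrefs(root, document_uri):
    """Return every xlink:href in document order, resolved to an absolute URI."""
    results = []

    def walk(el, base):
        # An xml:base on this element applies to it and to its descendants.
        base = urljoin(base, el.get(XML_BASE, ""))
        href = el.get(XLINK_HREF)
        if href is not None:
            results.append(urljoin(base, href))
        for child in el:
            walk(child, base)

    walk(root, document_uri)
    return results

DOC = """<doc xmlns:xlink="http://www.w3.org/1999/xlink">
  <linkbase xml:base="http://www.oreilly.com/bdl/">
    <uri xlink:href="linkbase.xml"/>
  </linkbase>
  <uri xlink:href="other.xml"/>
</doc>"""

for uri in resolved_hrefs(ET.fromstring(DOC), "http://example.com/docs/story.xml"):
    print(uri)
```

The second <uri> element lies outside the element that sets xml:base, so it is resolved against the document's own location instead.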
Copyright © 2003 O'Reilly & Associates. All rights reserved.