XML in a Nutshell

3.5. External Parsed General Entities

The footer example is about at the limits of what you can comfortably fit in a DTD. In practice, web sites prefer to store repeated content like this in external files and load it into their pages using PHP, server-side includes, or some similar mechanism. XML supports this technique through external general entity references, though in this case the client, rather than the server, is responsible for integrating the different pieces of the document into a coherent whole.

An external parsed general entity is declared in the DTD using an ENTITY declaration. However, instead of the actual replacement text, the SYSTEM keyword and a URI pointing to the replacement text are given. For example:

<!ENTITY footer SYSTEM "http://www.oreilly.com/boilerplate/footer.xml">

Of course, a relative URL will often be used instead. For example:

<!ENTITY footer SYSTEM "/boilerplate/footer.xml">

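In either case, when the general entity reference &footer; is seen in the character data of an element, the parser may replace it with the document found at http://www.oreilly.com/boilerplate/footer.xml. (A relative URL is resolved against the location of the document or DTD that contains the entity declaration, so the second form points to the same file only when that declaration is itself served from www.oreilly.com.) References to external parsed entities are not allowed in attribute values. Most of the time, this shouldn't be too big a hassle because attribute values tend to be small enough to be easily included in internal entities.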
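To make the replacement step concrete, here is a minimal sketch in Python using the standard xml.parsers.expat module. It is only an illustration, not a description of how any particular browser or XML tool works. The FAKE_WEB dictionary is a hypothetical stand-in for fetching the URL over the network, and the one-line footer content and the page and body element names are invented for the example.

```python
import xml.parsers.expat

FOOTER_URL = "http://www.oreilly.com/boilerplate/footer.xml"

# Hypothetical stand-in for fetching the URL, so the sketch needs no network.
FAKE_WEB = {
    FOOTER_URL: b"<p>Copyright 2000, O'Reilly &amp; Associates, Inc.</p>",
}

DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE page [
  <!ENTITY footer SYSTEM "http://www.oreilly.com/boilerplate/footer.xml">
]>
<page><body>Hello.</body>&footer;</page>"""


def parse_with_footer(document):
    """Parse document, expanding external entities from FAKE_WEB.

    Returns (element names in document order, all character data joined).
    """
    names, chunks = [], []
    parser = xml.parsers.expat.ParserCreate()

    def install_handlers(p):
        p.StartElementHandler = lambda name, attrs: names.append(name)
        p.CharacterDataHandler = chunks.append

    def external_ref(context, base, system_id, public_id):
        # Called when the parser meets &footer; in element content.
        child = parser.ExternalEntityParserCreate(context)
        # Set the handlers explicitly so the sketch does not rely on
        # whatever the child parser inherits from its parent.
        install_handlers(child)
        child.Parse(FAKE_WEB[system_id], True)
        return 1  # nonzero tells expat to carry on with the main document

    install_handlers(parser)
    parser.ExternalEntityRefHandler = external_ref
    parser.Parse(document, True)
    return names, "".join(chunks)
```

Calling parse_with_footer(DOCUMENT) reports the p element from the footer as if it had been typed directly inside page, which is the effect the entity reference is meant to have.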

Notice we wrote that the parser may replace the entity reference with the document at the URL, not that it must. This is an area where parsers have some leeway in just how much of the XML specification they wish to implement. A validating parser must retrieve such an external entity. However, a nonvalidating parser may or may not choose to retrieve the entity.
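The difference is easy to observe with a nonvalidating parser. The following sketch, again assuming Python's standard xml.parsers.expat module, parses the same document twice: once with no external entity handler installed, and once with a handler that supplies the replacement text. The collect_text function, the footer.xml system identifier, and the sample strings are hypothetical names chosen for the illustration.

```python
import xml.parsers.expat

DOCUMENT = b"""<!DOCTYPE page [
  <!ENTITY footer SYSTEM "footer.xml">
]>
<page>Body text.&footer;</page>"""


def collect_text(document, footer=None):
    """Return the character data the application sees.

    footer=None mimics a parser that chooses not to retrieve external
    entities; passing bytes mimics one that does retrieve them.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    if footer is not None:
        def external_ref(context, base, system_id, public_id):
            child = parser.ExternalEntityParserCreate(context)
            child.CharacterDataHandler = chunks.append
            child.Parse(footer, True)
            return 1

        parser.ExternalEntityRefHandler = external_ref

    parser.Parse(document, True)
    return "".join(chunks)
```

With no handler installed, the parse still succeeds and the reference is passed over without an error, so the application receives only the text that was physically present in the document. Code that depends on the footer being there cannot assume every parser will deliver it.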

Furthermore, not all text files can serve as external entities. In order to be loaded in by a general entity reference, the document must be potentially well-formed when inserted into an existing document. This does not mean the external entity itself must be well-formed. In particular, the external entity might not have a single root element. However, if such a root element were wrapped around the external entity, then the resulting document should be well-formed. This means, for example, that all elements that start inside the entity must finish inside the same entity. They cannot finish inside some other entity. Furthermore, the external entity does not have a prolog and, therefore, cannot have an XML declaration or a document type declaration.
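The wrapping test described above can be approximated in a few lines. This is a rough sketch, assuming Python's standard xml.etree.ElementTree module. The could_be_external_entity function and the wrapper element name are hypothetical, and the regular expression that strips a leading text declaration is deliberately loose rather than a faithful implementation of the grammar.

```python
import re
import xml.etree.ElementTree as ET

# Loose match for a leading text declaration; it must be removed before
# wrapping, because a declaration is only legal at the very start.
LEADING_DECL = re.compile(r"^<\?xml[^>]*\?>")


def could_be_external_entity(text):
    """Return True if text is well-formed once wrapped in a root element."""
    body = LEADING_DECL.sub("", text, count=1)
    try:
        ET.fromstring("<wrapper>" + body + "</wrapper>")
    except ET.ParseError:
        return False
    return True
```

For instance, could_be_external_entity('<hr/><p>two top-level elements</p>') returns True even though the text has no single root element, while could_be_external_entity('<p>unclosed') returns False because the p element would have to finish in some other entity.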

3.5.1. Text Declarations

Instead of an XML declaration, an external entity may have a text declaration; this looks a lot like an XML declaration. The main difference is that in a text declaration the encoding declaration is required, while the version info is optional. Furthermore, there is no standalone declaration. The main purpose of the text declaration is to warn the parser if the external entity uses a different text encoding than the including document. For example, this is a common text declaration:

<?xml version="1.0" encoding="MacRoman"?>

However, you could also use this text declaration with no version information:

<?xml encoding="MacRoman"?>
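The rules for a text declaration (optional version, required encoding, no standalone) can be captured in a small pattern. The following is a sketch only, written with Python's standard re module. The parse_text_declaration function is a hypothetical helper, and the pattern is a simplified reading of the XML 1.0 grammar rather than a replacement for a real parser.

```python
import re

# Optional version info, required encoding declaration, nothing else.
TEXT_DECL = re.compile(
    r"""<\?xml
        (?:\s+version\s*=\s*(?P<vq>["'])(?P<version>1\.[0-9]+)(?P=vq))?
        \s+encoding\s*=\s*(?P<eq>["'])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)(?P=eq)
        \s*\?>""",
    re.VERBOSE,
)


def parse_text_declaration(text):
    """Return (version, encoding) if text starts with a text declaration.

    version is None when the declaration omits it; the result is None when
    the text does not begin with a valid text declaration.
    """
    match = TEXT_DECL.match(text)
    if match is None:
        return None
    return match.group("version"), match.group("encoding")
```

Both declarations shown above are accepted by this pattern. A declaration that omits the encoding, or one that adds standalone="yes", is rejected, which reflects the two ways a text declaration differs from an XML declaration.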

Example 3-5 is a well-formed external entity that could be included from another document using an external general entity reference.

Example 3-5. An external parsed entity

<?xml encoding="ISO-8859-1"?>
<hr size="1" noshade="true"/>
<font CLASS="footer">
  <a href="index.html">O'Reilly Home</a> |
  <a href="sales/bookstores/">O'Reilly Bookstores</a> |
  <a href="order_new/">How to Order</a> |
  <a href="oreilly/contact.html">O'Reilly Contacts</a><br/>
  <a href="http://international.oreilly.com/">International</a> |
  <a href="oreilly/about.html">About O'Reilly</a> |
  <a href="affiliates.html">Affiliated Companies</a>
</font>
<p>
  <font CLASS="copy">
    Copyright 2000, O'Reilly &amp; Associates, Inc.<br/>
    <a href="mailto:webmaster@oreilly.com">webmaster@oreilly.com</a>
  </font>
</p>


Copyright © 2002 O'Reilly & Associates. All rights reserved.