XML in a Nutshell

2.5. Entity References

The character data inside an element must not contain a raw, unescaped opening angle bracket (<). In character data, this character is always interpreted as the start of markup, most often a tag. If you need this character in your text, escape it with the &lt; entity reference. When a parser reads the document, it replaces the &lt; entity reference with the actual < character, but it does not confuse &lt; with the start of a tag. For example:

<SCRIPT LANGUAGE="JavaScript">
  if (location.host.toLowerCase().indexOf("cafeconleche") &lt; 0) {
    location.href="http://www.cafeconleche.org/";
  }
</SCRIPT>
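To watch this replacement happen, here is a minimal sketch that hands the example to a parser. It assumes Python's standard xml.etree.ElementTree module as the parser, which is our choice for illustration only; the variable names (doc, script, broken, rejected) are ours as well.

```python
import xml.etree.ElementTree as ET

# The SCRIPT element from the example, held in a Python string.
doc = """<SCRIPT LANGUAGE="JavaScript">
  if (location.host.toLowerCase().indexOf("cafeconleche") &lt; 0) {
    location.href="http://www.cafeconleche.org/";
  }
</SCRIPT>"""

script = ET.fromstring(doc)

# The application receives the real character, not the entity reference.
print("&lt;" in script.text)   # False
print(" < 0" in script.text)   # True

# With a raw < in place of &lt;, the document is no longer well-formed.
broken = doc.replace("&lt;", "<")
try:
    ET.fromstring(broken)
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)                # True
```

The last few lines show the other half of the rule: the same text with a raw < is rejected outright rather than silently repaired.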

The character data inside an element may not contain a raw unescaped ampersand (&) either. This is always interpreted as beginning an entity or character reference. However, the ampersand may be escaped using the &amp; entity reference like this:

<publisher>O'Reilly &amp; Associates</publisher>

Entity references such as &amp; and &lt; are considered to be markup. When an application parses an XML document, it replaces this particular markup with the actual characters to which the entity reference refers.
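When text is generated by a program, the escaping is usually left to a library rather than done by hand. The following sketch assumes Python's standard xml.sax.saxutils.escape function and the xml.etree.ElementTree parser; both are our choices for illustration, and the variable names are hypothetical.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = "O'Reilly & Associates"

# escape() replaces &, <, and > with their entity references.
escaped = escape(raw)
print(escaped)          # O'Reilly &amp; Associates

# Parsing the element turns the reference back into the character.
doc = "<publisher>" + escaped + "</publisher>"
publisher = ET.fromstring(doc)
print(publisher.text)   # O'Reilly & Associates
```

The round trip is the point: the markup in the file contains &amp;, while the application that reads it sees a plain ampersand.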

XML predefines exactly five entity references. These are:

&lt;
The less-than sign; a.k.a. the opening angle bracket (<)

&amp;
The ampersand (&)

&gt;
The greater-than sign; a.k.a. the closing angle bracket (>)

&quot;
The straight double quotation mark (")

&apos;
The apostrophe; a.k.a. the straight single quote (')
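As a quick check of the list above, this sketch parses a document that uses all five references with no DTD at all. It assumes Python's standard xml.etree.ElementTree module; the chars element is a made-up name for the example.

```python
import xml.etree.ElementTree as ET

# All five predefined entity references, with no declarations anywhere.
doc = "<chars>&lt;&amp;&gt;&quot;&apos;</chars>"

text = ET.fromstring(doc).text

# Each reference has been replaced by the single character it stands for.
print(len(text))   # 5
print(text)
```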

Only &lt; and &amp; must be used instead of the literal characters in element content. The others are optional. &quot; and &apos; are useful inside attribute values where a raw " or ' might be misconstrued as ending the attribute value. For example, this image tag uses the &apos; entity reference to fill in the apostrophe in O'Reilly:

<image source='oreilly_koala3.gif' width='122' height='66'
       alt='Powered by O&apos;Reilly Books'
/>
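The sketch below reads the alt attribute back out of this tag and then shows one way a program might quote attribute values when writing them. It assumes Python's standard xml.etree.ElementTree parser and the xml.sax.saxutils.quoteattr helper; the second sample string is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

doc = """<image source='oreilly_koala3.gif' width='122' height='66'
       alt='Powered by O&apos;Reilly Books'
/>"""

# The parser returns the attribute value with the apostrophe restored.
alt = ET.fromstring(doc).get("alt")
print(alt)

# A raw apostrophe would end the single-quoted value early.
try:
    ET.fromstring(doc.replace("&apos;", "'"))
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)

# quoteattr() picks a delimiter that does not clash with the value and
# falls back to &quot; only when the value contains both kinds of quote.
quoted = quoteattr(alt)
both = quoteattr('The "O\'Reilly" koala')
print(quoted)
print(both)
```

Switching the delimiter to double quotes, as quoteattr does for the first value, is an alternative to &apos; that leaves the apostrophe readable in the source.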

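An unescaped greater-than sign (>) cannot be misinterpreted as closing a tag it wasn't meant to close, so &gt; is rarely required; it is allowed mostly for symmetry with &lt;. The one exception is the three-character sequence ]]>, which may not appear literally in character data because it marks the end of a CDATA section. There, the > must be written as &gt;.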
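A short sketch, again assuming Python's standard xml.etree.ElementTree module and a made-up claim element, shows that a parser treats the raw and escaped forms of the greater-than sign identically in ordinary text:

```python
import xml.etree.ElementTree as ET

# A raw > in element content is well-formed.
raw = ET.fromstring("<claim>3 > 2</claim>").text

# The escaped form yields exactly the same character data.
escaped = ET.fromstring("<claim>3 &gt; 2</claim>").text

print(raw == escaped)   # True
```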

In addition to the five predefined entity references, you can define others in the document type definition. We'll discuss how to do this in Chapter 3.




Copyright © 2002 O'Reilly & Associates. All rights reserved.