
C.4. Leftovers

Here are three terms that appear throughout the XML literature and may stymie the XML beginner:

Attribute

A name/value pair that describes an element and appears inside its start tag. To reuse a previous example, in <img src="picture.jpg" />, src="picture.jpg" is an attribute of the img element. There is some controversy in the XML world about when to put data in the contents of an element and when to put it in attributes. The best set of guidelines on this particular issue can be found at http://www.oasis-open.org/cover/elementsAndAttrs.html.
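To make the attribute-versus-content choice concrete, here is a minimal sketch that stores the same filename both ways and reads it back. It assumes Python's standard-library xml.etree.ElementTree module purely for illustration; the nested <src> element and all variable names are hypothetical and do not come from the text above.

```python
import xml.etree.ElementTree as ET

# The same information expressed two ways (the second form is a
# hypothetical alternative, invented here for comparison).
as_attribute = ET.fromstring('<img src="picture.jpg" />')
as_content = ET.fromstring('<img><src>picture.jpg</src></img>')

# Attributes are exposed as a dictionary attached to the element.
attr_value = as_attribute.attrib["src"]

# Element content is reached through the child element's text.
content_value = as_content.find("src").text

print(attr_value)     # picture.jpg
print(content_value)  # picture.jpg
```

Both forms carry the same data; the difference lies in how a program has to reach it, which is part of what the controversy is about.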

CDATA

The term CDATA (Character Data) is used in two contexts. Most of the time it refers to everything in an XML document that is not markup (tags, etc.). The second context involves CDATA sections. A CDATA section, delimited by <![CDATA[ and ]]>, tells an XML parser to treat its contents as literal text even if they contain characters that could be construed as markup.
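The effect of a CDATA section is easiest to see by parsing the same text with and without one. The following is a minimal sketch, again assuming Python's standard-library xml.etree.ElementTree; the <note> element and its contents are made up for the example.

```python
import xml.etree.ElementTree as ET

# Text containing < and & is protected by wrapping it in a CDATA section.
protected = '<note><![CDATA[Use <b> for bold & <i> for italics]]></note>'
note = ET.fromstring(protected)

cdata_text = note.text   # the parser hands the text back untouched
child_count = len(note)  # no child elements were created from <b> or <i>

# Without the CDATA section the same text is not well-formed XML:
# the bare & is read as the start of an entity reference.
unprotected = '<note>Use <b> for bold & <i> for italics</note>'
try:
    ET.fromstring(unprotected)
    parsed_without_cdata = True
except ET.ParseError:
    parsed_without_cdata = False

print(cdata_text)            # Use <b> for bold & <i> for italics
print(child_count)           # 0
print(parsed_without_cdata)  # False
```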

PCDATA

Tim Bray's annotation of the XML specification (mentioned earlier) gives the following definition:

The string PCDATA itself stands for "Parsed Character Data." It is another inheritance from SGML; in this usage, "parsed" means that the XML processor will read this text looking for markup signaled by < and & characters.

You can think of this as data composed of CDATA and potentially some markup. Most XML data falls into this classification.
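As a rough illustration of what "parsed" means here, the sketch below feeds a parser some text that mixes character data, an entity reference, and a child element. It assumes Python's standard-library xml.etree.ElementTree; the <para> and <em> elements are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# PCDATA: the parser scans this text for markup signaled by < and &.
para = ET.fromstring('<para>5 &lt; 10 and <em>markup</em> is parsed</para>')

leading_text = para.text                   # &lt; has been resolved to <
child_tags = [child.tag for child in para] # <em> became a child element
em_text = para[0].text                     # character data inside <em>
trailing_text = para[0].tail               # character data after </em>

print(repr(leading_text))   # '5 < 10 and '
print(child_tags)           # ['em']
print(repr(em_text))        # 'markup'
print(repr(trailing_text))  # ' is parsed'
```

The entity reference and the embedded tag were both interpreted rather than passed through, which is the opposite of what happens inside a CDATA section.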

XML has a bit of a learning curve. This small tutorial should help you get started.




Copyright © 2001 O'Reilly & Associates. All rights reserved.