XML in a Nutshell

Chapter 24. DOM Reference

Contents:

Object Hierarchy
Object Reference

The Document Object Model (DOM) is a language- and platform-independent object framework for manipulating structured documents (see Chapter 18 for additional information). The current W3C recommendation specifies what is called the Level 2 DOM. The full Level 2 DOM is designed to support editing of HTML documents, with several classes and methods specific to HTML document structures. This larger DOM is built on top of a smaller, but complete, subset called the Core DOM. Only the Core DOM is required to support editing of XML documents.

TIP: Other parts of DOM Level 2 may be useful for specific kinds of XML processing, particularly the Style, Traversal, and Range modules.

This reference section documents the Levels 1 and 2 Core DOM objects, using the language-neutral OMG IDL object descriptions. Included with each IDL description is the language-specific binding for the Java programming language. Level 2-only constructs are indicated using the 2 symbol after the given attribute or method name.

TIP: This chapter is based on the Document Object Model (DOM) Level 2 Core Specification, which was released on November 13, 2000. The latest version of this recommendation, along with any errata that have been reported, is available on the W3C DOM Activity's web site (http://www.w3.org/DOM/DOMTR).

The DOM structures a document as a hierarchy of Node objects. The Node interface is the base interface for every member of a DOM document tree. It exposes attributes common to every type of document object and provides a few simple methods to retrieve type-specific information without resorting to downcasting. This interface also exposes all methods used to query, insert, and remove objects from the document hierarchy. The Node interface makes it easier to build general-purpose tree-manipulation routines that are not dependent on specific document element types.

Attr

The Attr interface represents the value assigned to an attribute of an XML element. Since the attributes NamedNodeMap of the Element interface is the only means of access to Attr objects within the DOM, the parentNode, previousSibling, and nextSibling attributes always return null. Although the Attr interface inherits the Node base interface, many basic Node methods are not applicable.

An XML element can acquire an attribute in several ways. An element has an attribute value if:

Though an Attr node is not officially part of the DOM document tree, it can be the parent of a value subtree. An Attr object can have EntityReference objects as children. The value attribute provides the expanded DOMString representation of this attribute. To determine if any entity replacements were made, it is necessary to check the Attr node for child nodes.

//Get the element's size attribute as an Attr object
Attr attrName = elem.getAttributeNode("size");
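
The following is a minimal sketch of how an Attr node behaves once it has been retrieved. It assumes the JAXP parser bundled with the JDK; the AttrSketch class, its parse helper, and the sample font element are hypothetical names introduced only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttrSketch {
    // Hypothetical helper: parses an XML string, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element elem = parse("<font size=\"12\"/>").getDocumentElement();
        Attr attrSize = elem.getAttributeNode("size");

        // The expanded string value of the attribute
        System.out.println(attrSize.getValue());
        // Attr nodes are not part of the tree, so there is no parent node
        System.out.println(attrSize.getParentNode() == null);
        // The owning element is reached through ownerElement instead
        System.out.println(attrSize.getOwnerElement() == elem);
        // The value subtree: here, a single Text child
        System.out.println(attrSize.getFirstChild().getNodeValue());
    }
}
```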

Attributes

The following attributes are defined for the Attr object:

CDATASection

The CDATASection interface contains the unparsed, unescaped data contained within CDATA blocks in an XML document. Though this interface inherits the Text interface, adjacent CDATASection blocks are not merged by the normalize( ) method of the Element interface.

Java example

// Open an XML source file
try {
    FileInputStream fis = new FileInputStream("phone_list.xml");
    StringBuffer sb = new StringBuffer( );
    // read the XML source file into memory
    int ch;
    while ((ch = fis.read( )) != -1) {
        sb.append((char)ch);
    }
    
    // now, create a CDATASection object to contain it within
    // an element of our document using the CDATA facility
    CDATASection ndCDATA = doc.createCDATASection(sb.toString( ));
} catch (IOException e) {
    ...
}

CDATASection is a pure subclass of the Text interface and has no attributes or methods of its own. See the Text interface section of this chapter for a list of applicable methods for accessing character data in nodes of this type.

CharacterData

The CharacterData interface is completely abstract, extending the basic Node interface only to support manipulation of character data. Every DOM object type that deals with text data inherits, directly or indirectly, from this interface.

This interface's string-handling facilities are similar to those found in most modern programming languages. Like C/C++ string-processing routines, all CharacterData routines are zero-based.

Java example

// Create a new, unattached Text node
Text ndText = doc.createTextNode("The truth is out there.");
// cast it to the CharacterData interface
CharacterData ndCD = (CharacterData)ndText;
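
To make the zero-based offsets concrete, here is a small sketch that edits the same sentence with replaceData, deleteData, and appendData. The CharacterDataSketch class and its edit method are hypothetical; the empty document is obtained through JAXP, which is an assumption of this example rather than part of the DOM itself.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.CharacterData;
import org.w3c.dom.Document;

class CharacterDataSketch {
    // Applies a few zero-based edits and returns the resulting text.
    static String edit(String original) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        // Text extends CharacterData, so no cast is needed here
        CharacterData cd = doc.createTextNode(original);

        cd.replaceData(4, 5, "answer");        // offset 4, count 5
        cd.deleteData(cd.getLength() - 1, 1);  // remove the last character
        cd.appendData("!");                    // add text at the end
        return cd.getData();
    }

    public static void main(String[] args) {
        System.out.println(edit("The truth is out there."));
    }
}
```

With the input shown, offset 4 is the first letter of "truth", so the five characters starting there are the ones replaced.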

Attributes

The following attributes are defined for CharacterData:

Methods

The following methods are defined for CharacterData:

Comment

This object contains the text of an XML comment (everything between the opening <!-- and closing -->). It inherits from CharacterData.

NOTE: The DOM specification does not require XML parsers to preserve the original document comments after the document is parsed. Some implementations strip comments as part of the parsing process.

Java example

// Create a comment
Comment ndComment = doc.createComment("Document was parsed by "
                    + "DOM utility.");

// and add it to the document
doc.appendChild(ndComment);

Document

The Document interface represents an entire, well-formed XML document. Once the Document object is created via the DOMImplementation interface, you can access every aspect of the underlying XML document through the various tree-navigation methods exposed by the Node interface, the parent of the Document interface.

In DOM documents, document elements cannot exist outside of a parent document. For this reason, the Document interface exposes several factory methods used to create new document elements.

Attributes

The following attributes are defined for the Document object:

Methods

The following methods are defined for the Document object:

createAttribute: name

This function creates an Attr object with the given name. Attr nodes construct complex element attributes that can include EntityReference objects and text data.

Argument

name: DOMString
The name of the XML attribute.

Return value

The new Attr object.

Exception

INVALID_CHARACTER_ERR
Indicates that the name you passed to createAttribute( ) doesn't conform to a valid XML name. See Chapter 2 for the XML restrictions on name construction.

Java binding

public Attr createAttribute(String name) throws DOMException;

Java example

// Create an entity reference
EntityReference er = doc.createEntityReference("name_entity");
    
// must create an Attribute object to include an explicit
// entity reference
Attr attr = doc.createAttribute("name");
    
// append the entity reference
attr.appendChild(er);

createAttributeNS: namespaceURI, qualifiedName (2)

This method serves the same purpose as the createAttribute method, but includes support for XML namespaces. See Chapter 4 for more information about namespaces.

Arguments

namespaceURI: DOMString
The URI associated with the namespace prefix in the qualifiedName parameter.

qualifiedName: DOMString
The name of the attribute to instantiate; includes the namespace prefix associated with the namespace URI given in the namespaceURI parameter.

Return value

The new Attr object is returned with the following attribute values:

Attribute: Node.nodeName
Value: The complete, fully qualified name given in the qualifiedName parameter

Attribute: Node.namespaceURI
Value: The given namespace URI

Attribute: Node.prefix
Value: The namespace prefix, which is parsed from the qualifiedName parameter

Attribute: Node.localName
Value: The local part of the qualified name, located to the right of the : character

Attribute: Attr.name
Value: The qualifiedName

Exceptions

INVALID_CHARACTER_ERR
Indicates that the name passed to createAttributeNS( ) doesn't conform to a valid XML name. See Chapter 2 for the XML restrictions on name construction.

NAMESPACE_ERR
Raised if the qualifiedName is malformed or has a prefix but no namespaceURI, or if the reserved xml namespace prefix was used incorrectly.

Java binding

public Attr createAttributeNS(String namespaceURI, String qualifiedName)
               throws DOMException;
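
As a rough illustration of the attribute values listed above, the sketch below creates a namespaced attribute and prints each field. The AttributeNsSketch class and createHref method are hypothetical, the XLink namespace is used only as a familiar sample URI, and the document comes from JAXP.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;

class AttributeNsSketch {
    // Creates an unattached, namespace-qualified attribute.
    static Attr createHref() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            return doc.createAttributeNS(
                    "http://www.w3.org/1999/xlink", "xlink:href");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Attr attr = createHref();
        System.out.println(attr.getNodeName());      // the qualified name
        System.out.println(attr.getNamespaceURI());  // the given URI
        System.out.println(attr.getPrefix());        // text before the colon
        System.out.println(attr.getLocalName());     // text after the colon
        System.out.println(attr.getName());          // same as nodeName
    }
}
```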

createComment: data

This returns a new Comment node containing the specified string. See the Comment object reference earlier in this chapter for special restrictions that apply to the contents of Comment nodes.

Argument

data: DOMString
The comment text.

Comment text restriction

The XML specification indicates that the -- characters must not appear in the comment text for compatibility reasons. Despite this warning, some DOM implementations don't flag comments containing double hyphens as syntax errors.

Java binding

public Comment createComment(String data);

Java example

// Create a timestamp comment
StringBuffer sb = new StringBuffer( );
Date dtNow = new Date( );

sb.append("\tModified " + dtNow.toString( ) + '\n');

Comment cmt = doc.createComment(sb.toString( ));

createDocumentFragment( )

This returns an empty DocumentFragment object. See the DocumentFragment reference later in this chapter for a discussion of a document fragment's uses and limitations.

Java binding

public DocumentFragment createDocumentFragment( );

createElement: tagName

This creates a new, empty Element node for use within the parent document. The element name is given as an argument to the method. The resulting Element node belongs to the parent Document object, but is not part of the document element hierarchy. See Node later in this chapter for more information about how the document hierarchy manipulation methods are used.

Argument

tagName: DOMString
The XML name used to create the new Element node. This name is assigned to the nodeName attribute of the resulting Element node.

Return value

The new Element object.

Exception

INVALID_CHARACTER_ERR
Indicates that the name you passed to createElement( ) doesn't conform to a valid XML name. See Chapter 2 for the XML restrictions on name construction.

Java binding

public Element createElement(String tagName) throws DOMException;

Java example

// Create the new my_tag Element
Element elOut = doc.createElement("my_tag");

createElementNS: namespaceURI, qualifiedName (2)

This method serves the same purpose as the createElement method, but includes support for XML namespaces. See Chapter 4 for more information about namespaces.

Arguments

namespaceURI: DOMString
The URI associated with the namespace prefix in the qualifiedName parameter.

qualifiedName: DOMString
The name of the element to instantiate, including the namespace prefix associated with the namespace URI given in the namespaceURI parameter.

Return value

The new Element object is returned with the following attribute values:

Attribute: Node.nodeName
Value: The complete, fully qualified name given in the qualifiedName parameter

Attribute: Node.namespaceURI
Value: The given namespace URI

Attribute: Node.prefix
Value: The namespace prefix, which is parsed from the qualifiedName parameter

Attribute: Node.localName
Value: The local part of the qualified name, located to the right of the : character

Attribute: Element.tagName
Value: The full element tag name, which is the same as the qualifiedName

Exceptions

INVALID_CHARACTER_ERR
Indicates that the name you passed to createElementNS( ) doesn't conform to a valid XML name. See Chapter 2 for the XML restrictions on name construction.

NAMESPACE_ERR
Raised if the qualifiedName is malformed, has a prefix but no namespaceURI, or if the reserved xml namespace prefix was used incorrectly.

Java binding

public Element createElementNS(String namespaceURI, 
                               String qualifiedName)
                  throws DOMException;
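
Java example

The sketch below shows one way the exceptions described above might be observed. The ElementNsSketch class, the tryCreate helper, and the http://example.com/ns URI are hypothetical; the behavior shown assumes an implementation with error checking enabled, such as the parser bundled with the JDK.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

class ElementNsSketch {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns 0 on success, or the code of the DOMException that was raised.
    static short tryCreate(Document doc, String namespaceURI,
                           String qualifiedName) {
        try {
            doc.createElementNS(namespaceURI, qualifiedName);
            return 0;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        // A prefix with a matching namespace URI is accepted
        System.out.println(
            tryCreate(doc, "http://example.com/ns", "ex:item") == 0);
        // A prefix without a namespace URI is rejected
        System.out.println(
            tryCreate(doc, null, "ex:item") == DOMException.NAMESPACE_ERR);
        // A name containing a space is not a valid XML name
        System.out.println(
            tryCreate(doc, "http://example.com/ns", "ex:bad name")
                == DOMException.INVALID_CHARACTER_ERR);
    }
}
```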

createEntityReference: name

This creates an EntityReference object.

Argument

name: DOMString
The name of the XML entity to be referenced. The name must match an XML entity declaration that is valid in the current document.

Exceptions

INVALID_CHARACTER_ERR
Indicates that the name you passed to createEntityReference( ) doesn't conform to a valid XML name. See Chapter 2 for the XML restrictions on name construction.

NOT_SUPPORTED_ERR
Generated if you attempted to create an entity reference using an HTML document.

Java binding

public EntityReference createEntityReference(String name)
                          throws DOMException;

Java example

// Create an entity reference
EntityReference er = doc.createEntityReference("name_entity");

getElementById: elementID (2)

This method returns the Element node with the given value for its ID attribute.

NOTE: It is important not to confuse attributes that have the name ID with ID attributes. ID attributes are attributes that were declared with the ID attribute type within the document type definition. See the Attribute List Declaration section in Chapter 20 for more information about ID attributes.

Argument

elementID: DOMString
The unique ID value for the desired element.

Return value

A single Element object that has the requested ID attribute or null, if no match is found.

Java binding

public Element getElementById(String elementId);
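
Java example

The distinction between an attribute named id and an attribute of type ID can be sketched as follows. The ElementByIdSketch class, its find helper, and the sample list documents are hypothetical, and the result for the document without a DTD reflects the behavior of the JDK's bundled parser, which is an assumption about the implementation in use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ElementByIdSketch {
    // Parses the XML string and looks up the element with the given ID.
    static Element find(String xml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementById(id);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The internal DTD subset declares item's id attribute as type ID
        String withDtd = "<!DOCTYPE list [<!ATTLIST item id ID #IMPLIED>]>"
                       + "<list><item id=\"a1\"/></list>";
        // Same content, but nothing declares id as an ID attribute
        String withoutDtd = "<list><item id=\"a1\"/></list>";

        System.out.println(find(withDtd, "a1") != null);
        System.out.println(find(withoutDtd, "a1") == null);
    }
}
```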

importNode: importedNode, deep (2)

This method's name is somewhat deceptive. It creates a copy of a Node object from another document that can be inserted within the current document's node hierarchy. Specifics of this copy operation vary, depending on the type of copied node.

Node type: ATTRIBUTE_NODE
Result: Copies the source attribute and all its children. The ownerElement attribute is set to null, and the specified flag is set to true.
Effect of deep flag: None.

Node type: DOCUMENT_FRAGMENT_NODE
Result: Creates an empty DocumentFragment node.
Effect of deep flag: Fully copies the children of the source DocumentFragment node.

Node type: DOCUMENT_NODE
Result: Cannot be imported.
Effect of deep flag: N/A.

Node type: DOCUMENT_TYPE_NODE
Result: Cannot be imported.
Effect of deep flag: N/A.

Node type: ELEMENT_NODE
Result: Copies the attribute nodes whose specified flag is set to the new element.
Effect of deep flag: Recursively copies all the source element's children.

Node type: ENTITY_NODE
Result: Copies the publicId, systemId, and notationName attributes.
Effect of deep flag: Recursively copies all of the Entity node's children.

Node type: ENTITY_REFERENCE_NODE
Result: Copies only the EntityReference node. Its value, if any, is taken from the DTD of the document doing the import.
Effect of deep flag: None.

Node type: NOTATION_NODE
Result: Imports the notation node, but since in Level 2 the DocumentType interface is read-only, it cannot be included in the target document.
Effect of deep flag: None.

Node type: PROCESSING_INSTRUCTION_NODE
Result: Copies the target and data values.
Effect of deep flag: None.

Node type: TEXT_NODE, CDATA_SECTION_NODE, COMMENT_NODE
Result: Copies the data and length attributes.
Effect of deep flag: None.

The new (copied) node object is returned based on the arguments.

Arguments

importedNode: Node
The node duplicated for use in the current document hierarchy.

deep: boolean
Whether to copy the single node given or the entire subtree of its children. For details, see the previous table.

Exception

NOT_SUPPORTED_ERR
Thrown if an attempt is made to import an unsupported Node type, such as a Document node.

Java binding

public Node importNode(Node importedNode, boolean deep)
    throws DOMException;
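
Java example

A minimal sketch of importing an element from one document into another, with and without the deep flag, might look like this. The ImportNodeSketch class and its helpers are hypothetical, and the two small documents exist only for the demonstration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ImportNodeSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies src's root element into dst and attaches it under dst's root.
    static Node importRoot(Document src, Document dst, boolean deep) {
        Node copy = dst.importNode(src.getDocumentElement(), deep);
        // The copy is owned by dst but has no parent until it is inserted
        dst.getDocumentElement().appendChild(copy);
        return copy;
    }

    public static void main(String[] args) {
        Document src = parse("<a><b>text</b></a>");
        Document dst = parse("<root/>");

        Node deepCopy = importRoot(src, dst, true);
        Node shallowCopy = importRoot(src, dst, false);

        System.out.println(deepCopy.getOwnerDocument() == dst);
        System.out.println(deepCopy.hasChildNodes());     // subtree copied
        System.out.println(shallowCopy.hasChildNodes());  // element only
        // The source document is left unchanged
        System.out.println(src.getDocumentElement().hasChildNodes());
    }
}
```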

DocumentFragment

The DocumentFragment is a lightweight container used to store XML document fragments temporarily. Since it has no properties or methods of its own, it can only provide the same functionality exposed by the Node object. It is intended to serve as a container for at least one well-formed XML subtree.

This object's most obvious application is in clipboard or drag-and-drop operations in a visual editor. The user may select several subtrees that appear at the same level of the tree to be copied:

<xml_example>
    <caption><filename>sample.xml</filename> before DocumentFragment
     copy operation</caption>
    <document>
        <parent>
            <child_1></child_1>
            <child_2></child_2>
        </parent>
        <parent>
        </parent>
    </document>
</xml_example>

If the user decides to copy the two child nodes to the clipboard, the DOM application would:

Then, when the user decides to paste the copied nodes to a new location, the new DocumentFragment node is passed to this target node's appendChild( ) method. During the copy operation, the DocumentFragment node itself is ignored, and only the children are attached to the target node.

<xml_example>
    <caption><filename>sample.xml</filename> after DocumentFragment copy 
     operation</caption>
    <document>
        <parent>
            <child_1></child_1>
            <child_2></child_2>
        </parent>
        <parent>
            <child_1></child_1>
            <child_2></child_2>
        </parent>
    </document>
</xml_example>
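
The copy-and-paste flow described above could be sketched roughly as follows. This is only one possible approach: it assumes the copy is made with cloneNode( ), and the FragmentCopySketch class, the copyChildren helper, and the whitespace-free version of the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FragmentCopySketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // "Copy": clone every child of source into a new fragment.
    static DocumentFragment copyChildren(Node source) {
        DocumentFragment clipboard =
                source.getOwnerDocument().createDocumentFragment();
        NodeList children = source.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            clipboard.appendChild(children.item(i).cloneNode(true));
        }
        return clipboard;
    }

    public static void main(String[] args) {
        Document doc = parse("<document><parent><child_1/><child_2/>"
                           + "</parent><parent/></document>");
        NodeList parents = doc.getElementsByTagName("parent");

        DocumentFragment clipboard = copyChildren(parents.item(0));
        // "Paste": only the fragment's children are attached
        parents.item(1).appendChild(clipboard);

        System.out.println(parents.item(1).getChildNodes().getLength());
        // The children were moved out of the fragment by the insert
        System.out.println(clipboard.hasChildNodes());
    }
}
```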

Java example

// Create a Document Fragment object
DocumentFragment dfNorm = doc.createDocumentFragment( );

DocumentType

The Document interface includes a single attribute, doctype, that points either to a DocumentType object describing the DTD for the current document or to null if none exists.

Java example

// get document type information
DocumentType dtDoc = doc.getDoctype( );

Attributes

The DocumentType object contains the following attributes:

DOMException

For languages and runtime platforms that support them, structured exceptions provide a way to separate the code that deals with abnormal or unexpected problems from the normal flow of execution. For languages that don't support exceptions, such as ECMAScript or Perl, these conditions are reported to your program as error codes from the method that recognized the condition.

The ExceptionCode is an integer value that indicates what type of exception was detected. The following ExceptionCodes are defined, with unused numeric codes reserved for future use by the W3C:

DOMImplementation

The DOMImplementation interface provides global information about the DOM implementation you currently use. The only way to obtain a reference to the DOMImplementation interface is through the getImplementation( ) method of the Document object.

Java example

// Check for DOM Level 1 support
DOMImplementation di = doc.getImplementation( );
// make sure that DOM Level 1 XML is supported
if (!di.hasFeature("XML", "1.0")) {
    return null;
}

Methods

The DOMImplementation object defines the following methods:

createDocument: namespaceURI, qualifiedName, doctype (2)

Creates a new, empty Document object with the given document type. It also creates the single, top-level document element using the given qualified name and namespace URI.

Arguments

namespaceURI: DOMString
The namespace URI used to create the top-level document element. Can be null if no namespace is used.

qualifiedName: DOMString
The namespace-aware qualified name of the top-level document element to be created. The prefix given in this parameter is associated with the namespace URI given in the namespaceURI parameter.

doctype: DocumentType
The document type definition object to be associated with the new document. If this parameter is not null, the DocumentType node's ownerDocument attribute is set to point to the new document object.

Exceptions

INVALID_CHARACTER_ERR
Indicates that the qualifiedName parameter has a malformed XML identifier.

NAMESPACE_ERR
Raised if an inconsistency exists between the values given for the namespaceURI and the qualifiedName parameters. Passing in a qualified name with a namespace prefix and not passing in a namespace URI is illegal. This can also be generated if a reserved namespace prefix, such as "xml", is given with an incorrect namespace URI.

WRONG_DOCUMENT_ERR
Raised if the DocumentType node passed in the doctype parameter is already associated with another document object. New DocumentType objects must be created using the new createDocumentType method of the DOMImplementation interface.

Java binding

public Document createDocument(String namespaceURI,
    String qualifiedName, DocumentType doctype) throws DOMException;
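
Java example

The following sketch creates a document together with its document type. It obtains the DOMImplementation from a JAXP DocumentBuilder, which is a convenience of the Java platform and an assumption of this example rather than part of the DOM specification; the CreateDocumentSketch class, the phone_list element name, and the phone_list.dtd system identifier are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

class CreateDocumentSketch {
    static DOMImplementation implementation() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a new document whose root element is <phone_list>.
    static Document newPhoneList(DOMImplementation impl) {
        // createDocumentType(qualifiedName, publicId, systemId)
        DocumentType doctype =
                impl.createDocumentType("phone_list", null, "phone_list.dtd");
        return impl.createDocument(null, "phone_list", doctype);
    }

    public static void main(String[] args) {
        Document doc = newPhoneList(implementation());
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getDoctype().getName());
        // The doctype's ownerDocument now points at the new document
        System.out.println(doc.getDoctype().getOwnerDocument() == doc);
    }
}
```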

Element

The Element object type provides access to the XML document's structure and data. Every XML element is translated into a single Element node. The document's root element is accessible through the documentElement property of the Document object. From this node, it is possible to re-create the full structure of the original XML document by traversing the element tree.

Java example

// Get the XML document's root element
Element elem = doc.getDocumentElement( );

This interface extends the basic Node interface to allow access to the XML attributes of the document element. Two sets of methods allow access to attribute values, either as Attr object trees or as simple DOMStrings.

Attribute

The Element object defines one attribute that contains the XML tag name:

Methods

The following methods are defined for this object:

getAttribute: name

Returns the value of the attribute specified by the name parameter as a DOMString. See getAttributeNode: name for a complete explanation of how an attribute value is determined. This method returns an empty string if no attribute is set and no default attribute value was specified in the DTD.

Java binding

public String getAttribute(String name);

Java example

// Check for the name attribute
Element elem = doc.getDocumentElement( );

// compare string contents with equals( ), not with ==
if (elem.getAttribute("name").equals("")) {
    System.out.println("warning: " + elem.getTagName( ) +
                   " element: no name attribute");
}

Entity

The Entity object represents a given general XML entity's replacement value. Depending on whether a given DOM implementation is validating or nonvalidating and whether it chooses to expand entity references inline during parsing, Entity objects may not be available to the DOM user.

Java example

// Locate the my_entity entity declaration
Entity ndEnt = (Entity)doc.getDoctype().getEntities( ).
    getNamedItem("my_entity");

Attributes

The following read-only attributes are defined for the Entity object:

EntityReference

EntityReference nodes appear within the document hierarchy wherever an XML general entity reference is embedded within the source document. Depending on the DOM implementation, a corresponding Entity object may exist in the entities collection of the doctype attribute of the Document object. If such an entity exists, then the child nodes of both the Entity and EntityReference represent the replacement text associated with the given entity.

Java example

// Create a new entity reference
EntityReference ndER = doc.createEntityReference("my_entity");

NamedNodeMap

The NamedNodeMap interface provides a mechanism used to retrieve Node objects from a collection by name. Though this interface exposes the same methods and attributes as the NodeList class, they are not related. While it is possible to enumerate the nodes in a NamedNodeMap using the item( ) method and length attribute, the nodes are not guaranteed to be in any particular order.

Java example

// Get an element's attributes
NamedNodeMap nnm = elem.getAttributes( );

Attribute

The NamedNodeMap defines one attribute:

Methods

The following methods are defined for the NamedNodeMap object:

setNamedItem: arg

Inserts the given Node object into the list, using its nodeName attribute. Since many DOM node types expose the same, hardcoded value for this property, storing only one of them in a single NamedNodeMap is possible. Each subsequent insertion overwrites the previous node entry. See the nodeName: DOMString topic for a discussion of these special name values.

This method returns a reference to the Node object that the new node replaces. If no nodes with the same nodeName value are currently in the map, this method returns null.

Argument

arg: Node
The Node object to be stored in the map. The value of the nodeName property is used as the lookup key. A node with the same nodeName value as the new node is replaced with the node referenced by arg.

Exceptions

WRONG_DOCUMENT_ERR
Raised if the arg node was created by a document other than the one that created the target NamedNodeMap.

NO_MODIFICATION_ALLOWED_ERR
Raised if the NamedNodeMap is read-only.

INUSE_ATTRIBUTE_ERR
Raised if the arg node is an Attr node that is already in use by another element's attributes map.

Java binding

public Node setNamedItem(Node arg) throws DOMException;

Java example

// Check to see if an ID attribute exists
// in this map, and add it if necessary
if (nnm.getNamedItem("id") == null) {
    // get the document
    Document doc = elem.getOwnerDocument( );
    // create a new attribute Node
    Attr attrID = doc.createAttribute("id");

    // set the attribute value
    attrID.appendChild(doc.createTextNode(makeUniqueID(elem)));

    // ... and add it to the NamedNodeMap
    nnm.setNamedItem(attrID);
}
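
The return value of setNamedItem( ) can be illustrated with a short sketch that replaces an existing attribute. The SetNamedItemSketch class, the replaceId helper, and the sample item element are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SetNamedItemSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Stores a new id attribute and returns the value of the attribute
    // it replaced, or null if the map had no id entry.
    static String replaceId(Element elem, String newValue) {
        NamedNodeMap nnm = elem.getAttributes();
        Attr attrID = elem.getOwnerDocument().createAttribute("id");
        attrID.setValue(newValue);

        Node replaced = nnm.setNamedItem(attrID);
        return replaced == null ? null : replaced.getNodeValue();
    }

    public static void main(String[] args) {
        Element elem = parse("<item id=\"old\"/>").getDocumentElement();
        System.out.println(replaceId(elem, "new"));  // value that was replaced
        System.out.println(elem.getAttribute("id")); // value now in the map
    }
}
```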

Node

The Node interface is the base interface for every member of a DOM document tree. It exposes attributes common to every type of document object and provides simple methods to retrieve type-specific information without resorting to downcasting. For instance, the attributes list provides access to the Element object's attributes, but it would have no meaning for a ProcessingInstruction node. (Extracting pseudoattributes from a processing instruction requires your application to parse the contents of the processing instruction.)

This interface also exposes all methods for querying, inserting, and removing objects from the document hierarchy. The Node interface makes it easier to build general-purpose tree-manipulation routines that are not dependent on specific document element types.

Attributes

The following attributes provide information about where the Node object is located within the document tree. These attributes are read-only. Additional methods allow the insertion and removal of nodes from the document tree.

Methods

The following methods are defined for Node interface objects:

insertBefore: newchild, refchild

Inserts the Node object newchild into the child list of the parent node that invokes it. The refchild parameter allows you to specify where to insert the new node in the list. If refchild is null, the new node is inserted at the end of the child list. (This behavior is the same as appendChild.) If it is not null, the new node is inserted into the list in front of the specified node. If the newchild node is already part of the document tree, it is unlinked before it is inserted in its new position. Also, if the newchild node references a DocumentFragment object, each of its children is inserted, in order, before the refchild node. A reference to the newchild node is returned.

Arguments

newchild: Node
The new node to insert.

refchild: Node
The node that follows the new node in the child list, or null, if the new node is inserted at the end of the child list.

Exceptions

HIERARCHY_REQUEST_ERR
Raised if the insert operation would violate at least one document structure rule. For instance, the node doesn't allow children or doesn't allow children of the newchild node type. This exception is also raised if the operation creates a circular reference (i.e., it tries to insert a node's parent as a node's child).

WRONG_DOCUMENT_ERR
Raised if the newchild node was created in a document different than that of the new parent node.

NO_MODIFICATION_ALLOWED_ERR
Raised if the new parent node is read-only.

NOT_FOUND_ERR
Raised if the node pointed to by refchild is not a child of the node performing the insert.

Java binding

public Node insertBefore(Node newChild, Node refChild)
               throws DOMException;

Java example

// Insert a new node at the head of the child list of a parent node
ndParent.insertBefore(ndNew, ndParent.getFirstChild( ));
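
A slightly fuller sketch shows two of the behaviors described above: a node that is already in the tree is moved rather than duplicated, and a null refchild appends. The InsertBeforeSketch class, its helpers, and the one-letter element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class InsertBeforeSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Concatenates the node names of all children, in order.
    static String childNames(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node n = parent.getFirstChild(); n != null;
                n = n.getNextSibling()) {
            sb.append(n.getNodeName());
        }
        return sb.toString();
    }

    static String reorder() {
        Document doc = parse("<p><a/><b/><c/></p>");
        Node parent = doc.getDocumentElement();

        // c is already in the tree: it is unlinked, then placed first
        Node c = parent.getLastChild();
        parent.insertBefore(c, parent.getFirstChild());

        // a null refchild behaves like appendChild
        parent.insertBefore(doc.createElement("d"), null);
        return childNames(parent);
    }

    public static void main(String[] args) {
        System.out.println(reorder());
    }
}
```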

isSupported: feature, version (2)

Checks to see if a particular DOM feature is available for this implementation. For more information about the feature names, see the hasFeature: feature, version method of the DOMImplementation object earlier in this chapter. This method returns true if the feature is available, false if it is not.

Arguments

feature: DOMString
The name of the feature to test for. See the description of the hasFeature: feature, version method of the DOMImplementation object for a list of this parameter's valid values.

version: DOMString
The version number of the feature to test. For DOM Level 2, Version 1, this string should be 2.0. If the version is not specified, this method tests for any version of the feature.

Java binding

public boolean isSupported(String feature, String version);

NodeList

The NodeList interface allows DOM classes to expose an ordered collection of nodes. A NodeList represents a read-only, zero-based array of Node objects. Since no mechanism exists for creating, adding, or removing nodes from a NodeList, DOM users cannot use this class as a general-purpose utility class.

Java example

// List the text contents of an element
NodeList nlChildren = elem.getChildNodes( );
Node ndChild;

for (int iNode = 0; iNode < nlChildren.getLength( ); iNode++) {
    ndChild = nlChildren.item(iNode);

    if (ndChild.getNodeType( ) == Node.TEXT_NODE) {
        System.out.println(ndChild.getNodeValue( ));
    }
}

Attributes

The NodeList interface defines one attribute:

Methods

The NodeList interface defines one method:

ProcessingInstruction

This interface provides access to the contents of an XML processing instruction. Processing instructions provide a mechanism for embedding commands for an XML processing application inline with the XML content.

Java example

// Add an application-specific processing instruction
ProcessingInstruction pi = doc.createProcessingInstruction("my_app",
        "action=\"save\"");

Attributes

The interface defines two attributes:

Text

Text nodes contain the nonmarkup character data contained within the XML document. After the XML document is parsed, exactly one Text node exists for each uninterrupted block of nonmarkup text:

<text_node>This is text.</text_node>
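
The one-node-per-block rule can be checked with a short sketch that parses the element above, splits its Text node, and then normalizes it again. The TextSplitSketch class and its counts helper are hypothetical, and the result assumes a parser, such as the one bundled with the JDK, that delivers the element content as a single Text node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class TextSplitSketch {
    // Returns the child count after parsing, after splitText( ),
    // and after normalize( ).
    static int[] counts() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(
                            "<text_node>This is text.</text_node>")));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        Element elem = doc.getDocumentElement();
        int parsed = elem.getChildNodes().getLength();

        // Split at offset 5: "This " stays, "is text." becomes a sibling
        Text first = (Text) elem.getFirstChild();
        first.splitText(5);
        int split = elem.getChildNodes().getLength();

        // normalize( ) merges adjacent Text nodes back together
        elem.normalize();
        int normalized = elem.getChildNodes().getLength();

        return new int[] { parsed, split, normalized };
    }

    public static void main(String[] args) {
        int[] c = counts();
        System.out.println(c[0] + " " + c[1] + " " + c[2]);
    }
}
```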

Method

The following method is defined for the Text interface:

24.1. Object Hierarchy

The following table shows the DOM object hierarchy:

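Object: Document
Permitted child objects: Element (one is the maximum), ProcessingInstruction, Comment, DocumentType (one is the maximum)

Object: DocumentFragment
Permitted child objects: Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference

Object: DocumentType
Permitted child objects: None (leaf node)

Object: EntityReference
Permitted child objects: Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference

Object: Element
Permitted child objects: Element, Text, Comment, ProcessingInstruction, CDATASection, EntityReference

Object: Attr
Permitted child objects: Text, EntityReference

Object: ProcessingInstruction
Permitted child objects: None (leaf node)

Object: Comment
Permitted child objects: None (leaf node)

Object: Text
Permitted child objects: None (leaf node)

Object: CDATASection
Permitted child objects: None (leaf node)

Object: Entity
Permitted child objects: Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference

Object: Notation
Permitted child objects: None (leaf node)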




Copyright © 2002 O'Reilly & Associates. All rights reserved.