Java and XML, 2nd Edition

6.4. DOM Level 3

Before closing the book on DOM and looking at common gotchas, I want to spend a little time on what's coming in DOM Level 3, which is under development right now. In fact, I expect this specification to be finalized early in 2002, not long from the time you are probably reading this book. The items I point out here aren't all of the changes and additions in DOM Level 3, but they are the ones that I think are of general interest to most DOM developers (that's you now, if you were wondering). Many of these are things that DOM programmers have been requesting for several years, so now you can look forward to them as well.

6.4.1. The XML Declaration

The first change in the DOM that I want to point out seems pretty trivial at first glance: exposure of the XML declaration. Remember those? Here's an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

There are three important pieces of information here that are not currently available in DOM: the version, the state of the standalone attribute, and the specified encoding. Additionally, the DOM tree itself has an encoding; this may or may not match the XML encoding attribute. For example, the associated encoding for "UTF-8" in Java turns out to be "UTF8", and there should be a way to distinguish between the two. All of these problems are solved in DOM Level 3 by the addition of four attributes to the Document interface. These are version (a String), standalone (a boolean), encoding (another String), and actualEncoding (String again). The accessor and mutator methods for these attributes are pretty straightforward:

public String getVersion( );
public void setVersion(String version);

public boolean getStandalone( );
public void setStandalone(boolean standalone);

public String getEncoding( );
public void setEncoding(String encoding);

public String getActualEncoding( );
public void setActualEncoding(String actualEncoding);

Most importantly, you'll finally be able to access the information in the XML declaration. This is a real boon to those writing XML editors and the like that need this information. It also helps developers working with internationalization and XML, as they can ascertain a document's encoding (encoding), create a DOM tree with its encoding (actualEncoding), and then translate as needed.
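To make this concrete, here is a minimal sketch of reading the XML declaration through the org.w3c.dom API bundled with current JDKs. Note the assumption: in that API the accessors are named getXmlVersion( ), getXmlStandalone( ), getXmlEncoding( ), and getInputEncoding( ) rather than the draft names listed above. The class name XmlDeclarationDemo and the sample document are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of any DOM specification
public class XmlDeclarationDemo {

    // Parse the XML text and report what the parser saw in the declaration
    public static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));

            return "version=" + doc.getXmlVersion()
                + " standalone=" + doc.getXmlStandalone()
                + " encoding=" + doc.getXmlEncoding()         // as declared
                + " inputEncoding=" + doc.getInputEncoding(); // as detected
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
            + "<root/>"));
    }
}
```

The split between the declared encoding and the encoding actually used to read the document is the same distinction drawn above between encoding and actualEncoding.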

6.4.2. Node Comparisons

In Levels 1 and 2 of DOM, the only way to compare two nodes is to do it manually. Developers end up writing utility methods that use instanceof to determine the type of Node, and then compare all the available method values to each other. In other words, it's a pain. DOM Level 3 offers several comparison methods that alleviate this pain. I'll give you the proposed signatures, and then tell you about each. They are all additions to the org.w3c.dom.Node interface, and look like this:

// See if the input Node is the same object as this Node
public boolean isSameNode(Node input);

// Tests for equality in structure (not object equality)
public boolean equalsNode(Node input, boolean deep);

/** Constants for document order */
public static final int DOCUMENT_ORDER_PRECEDING = 1;
public static final int DOCUMENT_ORDER_FOLLOWING = 2;
public static final int DOCUMENT_ORDER_SAME      = 3;
public static final int DOCUMENT_ORDER_UNORDERED = 4;

// Determine the document order of input in relation to this Node
public int compareDocumentOrder(Node input) throws DOMException;

/** Constants for tree position */
public static final int TREE_POSITION_PRECEDING  = 1;
public static final int TREE_POSITION_FOLLOWING  = 2;
public static final int TREE_POSITION_ANCESTOR   = 3;
public static final int TREE_POSITION_DESCENDANT = 4;
public static final int TREE_POSITION_SAME       = 5;
public static final int TREE_POSITION_UNORDERED  = 6;

// Determine the tree position of input in relation to this Node
public int compareTreePosition(Node input) throws DOMException;

The first method, isSameNode( ), allows for object comparison. This doesn't determine whether the two nodes have the same structure or data, but whether they are the same object in the JVM. The second method, equalsNode( ), is probably going to be more commonly used in your applications. It tests for Node equality in terms of data and type (obviously, an Attr will never be equal to a DocumentType). It provides a parameter, deep, to allow comparison of just the Node itself or of all its child Nodes as well.
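The difference between object identity and structural equality is easier to see in code. This is only a sketch, and it assumes the org.w3c.dom API bundled with current JDKs, where the structural test is named isEqualNode( ) and takes no deep parameter (it always compares children). The class NodeEqualityDemo and the item element are hypothetical sample data.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of any DOM specification
public class NodeEqualityDemo {

    // Create an empty Document through JAXP
    public static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    // Build <item id="1">text</item>; the element is not attached to the tree
    public static Element buildItem(Document doc) {
        Element item = doc.createElement("item");
        item.setAttribute("id", "1");
        item.appendChild(doc.createTextNode("text"));
        return item;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element first = buildItem(doc);
        Element second = buildItem(doc);

        // Same object in the JVM?
        System.out.println("same object:    " + first.isSameNode(second));
        // Same type, name, attributes, and children?
        System.out.println("same structure: " + first.isEqualNode(second));
    }
}
```

The two elements are distinct objects built from identical data, so the first test fails while the second succeeds.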

The next two methods, compareDocumentOrder( ) and compareTreePosition( ), determine the position of an input Node relative to the current Node. Both return one of several defined constants. With compareDocumentOrder( ), the input node can be before the current one in the document, after it, in the same position, or unordered. The unordered value occurs when comparing an attribute to an element, or in any other case where the term "document order" has no contextual meaning. A DOMException occurs when the two nodes being queried are not in the same DOM Document object. The final new method, compareTreePosition( ), provides the same sort of comparison, but adds the ability to determine ancestry. Two additional constants, TREE_POSITION_ANCESTOR and TREE_POSITION_DESCENDANT, allow for this. The first denotes that the input Node is up the hierarchy from the reference Node (the one the method is invoked upon); the second indicates that the input Node is down the hierarchy from the reference Node.

With these four methods, you can isolate any DOM structure and determine how it relates to another. This addition to DOM Level 3 should serve you well, and you can count on using all of the comparison methods in your coding. Keep an eye on both the constant names and values, though, as they may change over the evolution of the specification.
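As a hedged illustration of positional comparison: the org.w3c.dom API bundled with current JDKs folds document order and tree position into a single method, compareDocumentPosition( ), which returns a bit mask of DOCUMENT_POSITION_* constants instead of the draft constants shown above. The sketch below assumes that API; the DocumentPositionDemo class, its describe( ) helper, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of any DOM specification
public class DocumentPositionDemo {

    // Parse XML text into a Document
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Describe where 'input' sits in relation to 'reference'
    public static String describe(Node reference, Node input) {
        short mask = reference.compareDocumentPosition(input);
        if (mask == 0) return "same";
        if ((mask & Node.DOCUMENT_POSITION_DISCONNECTED) != 0) return "disconnected";
        // An ancestor also has the PRECEDING bit set, so test ancestry first
        if ((mask & Node.DOCUMENT_POSITION_CONTAINS) != 0) return "ancestor";
        if ((mask & Node.DOCUMENT_POSITION_CONTAINED_BY) != 0) return "descendant";
        if ((mask & Node.DOCUMENT_POSITION_PRECEDING) != 0) return "preceding";
        if ((mask & Node.DOCUMENT_POSITION_FOLLOWING) != 0) return "following";
        return "implementation-specific";
    }

    public static void main(String[] args) {
        Document doc = parse("<root><a><b/></a><c/></root>");
        Node root = doc.getDocumentElement();
        Node a = root.getFirstChild();
        Node b = a.getFirstChild();
        Node c = root.getLastChild();

        System.out.println("root relative to a: " + describe(a, root));
        System.out.println("b relative to a:    " + describe(a, b));
        System.out.println("c relative to a:    " + describe(a, c));
        System.out.println("a relative to c:    " + describe(c, a));
    }
}
```

Because the result is a bit mask, more than one flag can be set at once, which is why the helper tests ancestry before plain ordering.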

6.4.3. Bootstrapping

The last addition in DOM Level 3 I want to cover is arguably the most important: the ability to bootstrap. I mentioned earlier that in creating DOM structures, you are forced to use vendor-specific code (unless you're using JAXP, which I'll cover in Chapter 9, "JAXP"). This is a bad thing, of course, as it knocks out vendor-independence. For the sake of discussion, I'll repeat a code fragment that creates a DOM Document object using a DOMImplementation here:

import org.w3c.dom.Document;
import org.w3c.dom.DOMImplementation;

import org.apache.xerces.dom.DOMImplementationImpl;

// Class declaration and other Java constructs

DOMImplementation domImpl = DOMImplementationImpl.getDOMImplementation( );
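// Arguments: namespace URI, root element name ("root" is a placeholder), DocumentType
Document doc = domImpl.createDocument(null, "root", null);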
// And so on...

The problem is that there is no way to get a DOMImplementation without importing and using a vendor's implementation class. The solution is to use a factory that provides DOMImplementation instances. Of course, the factory is actually providing a vendor's implementation of DOMImplementation (I know, I know, it's a bit confusing). Vendors can set system properties or provide their own versions of this factory so that it returns the implementation class they want. The resulting code to create DOM trees then looks like this:

import org.w3c.dom.Document;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DOMImplementationFactory;

// Class declaration and other Java constructs

DOMImplementation domImpl = 
    DOMImplementationFactory.getDOMImplementation( );
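// Arguments: namespace URI, root element name ("root" is a placeholder), DocumentType
Document doc = domImpl.createDocument(null, "root", null);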
// And so on...

The class being added is DOMImplementationFactory, and it should solve most of your vendor-independence issues once it's in place. Look for this as the flagship of DOM Level 3, as it's one of the most requested features for current levels of DOM.
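For comparison, here is a sketch of vendor-neutral bootstrapping as it can be written against current JDKs. The assumption to flag loudly: the bundled API provides this capability through org.w3c.dom.bootstrap.DOMImplementationRegistry, not through a class named DOMImplementationFactory. The BootstrapDemo class and the root element name are hypothetical.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

// Hypothetical demo class; not part of any DOM specification
public class BootstrapDemo {

    // Create a Document without importing any vendor-specific class
    public static Document newDocument(String rootName) {
        try {
            // The registry locates whatever DOM implementation is available
            DOMImplementationRegistry registry =
                DOMImplementationRegistry.newInstance();

            // Ask for an implementation supporting the "XML" feature, version 3.0
            DOMImplementation domImpl =
                registry.getDOMImplementation("XML 3.0");
            if (domImpl == null) {
                throw new IllegalStateException("No suitable DOM implementation");
            }

            // Arguments: namespace URI, root element name, DocumentType
            return domImpl.createDocument(null, rootName, null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load DOM registry", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument("root");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Only org.w3c.dom types appear in the imports, which is exactly the vendor independence that the factory approach is after.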




Copyright © 2002 O'Reilly & Associates. All rights reserved.