Logo Data Structures and Algorithms with Object-Oriented Design Patterns in C++
next up previous contents index

Union by Size

While using collapsing find does mitigate the negative effects of poor trees, a better approach is to avoid creating bad trees in the first place. As shown in Figure gif, when we join to trees we have a choice--which node should we choose to be the root of the new tree. A simple, but effective choice is to attach the smaller tree under the root of the larger one. In this case, the smaller tree is the one which has fewer nodes. This is the so-called union-by-size  join algorithm. Program gif shows how this can be done.

   program29999
Program: PartitionAsForest Class Union-by-Size Join Member Function Definition

The implementation uses the count field of the Container class, from which PartitionTree is derived, to keep track of the number of items contained in the tree. (Since each node contains one item from the universal set, the number of items contained in a tree is equal to the number of nodes in that tree). The algorithm simply selects the tree with the largest number of nodes to become the root of the result and attaches the root of the smaller tree under that of the larger one. Clearly, the running time of the union-by-size version of Join is O(1).

The following theorem shows that when using the union-by-size join operation, the heights of the resulting trees grow logarithmically.

Theorem  Consider an initial partition P of the universe tex2html_wrap_inline67278 comprised of N sets of size 1. Let S be an element of the partition obtained from P after some sequence of union-by-size join operations, such that |S|=n for some tex2html_wrap_inline59533. Let T be the tree representing the set S. The height of tree T satisfies the inequality

displaymath67766

extbfProof (By induction).

Base Case Since a tree comprised of a single node has height zero, the theorem clearly holds for n=1.

Inductive Hypothesis Suppose the theorem holds for trees containing n nodes for tex2html_wrap_inline67796 for some tex2html_wrap_inline59577. Consider a union-by-size join operation that produces a tree containing k+1 nodes. Such a tree is obtained by joining a tree tex2html_wrap_inline67802 having tex2html_wrap_inline67804 nodes with another tree tex2html_wrap_inline67806 that has tex2html_wrap_inline67808 nodes, such that l+m=k+1.

Without loss of generality, suppose tex2html_wrap_inline67812. As a result, l is less than or equal to m. Therefore, the union-by-size algorithm will attach tex2html_wrap_inline67802 under the root of tex2html_wrap_inline67806. Let tex2html_wrap_inline67822 and tex2html_wrap_inline67824 be the heights of tex2html_wrap_inline67802 and tex2html_wrap_inline67828 respectively. The height of the resulting tree is tex2html_wrap_inline67830.

According to the inductive hypothesis, the height of tex2html_wrap_inline67806 is given by

eqnarray30020

Similarly, the quantity tex2html_wrap_inline67834 is bounded by

eqnarray30022

Therefore, the height of the tree containing k+1 nodes is no greater than tex2html_wrap_inline67838. By induction on k, the theorem holds for all values of tex2html_wrap_inline59533.

Note that Theorem gif and its proof does not require that we use the collapsing find algorithm of Section gif. I.e., the height of a tree containing n nodes is guaranteed to be tex2html_wrap_inline59891 when the simple find is used. Of course, there is nothing precluding the use of the collapsing find in conjunction with the union-by-size join routine. And doing so only makes things better.


next up previous contents index

Bruno Copyright © 1997 by Bruno R. Preiss, P.Eng. All rights reserved.