Cover Data Structures and Algorithms with Object-Oriented Design Patterns in Java
next up previous contents index

AVL Search Trees

 

The problem with binary search trees is that while the average running times for search, insertion, and withdrawal operations are all tex2html_wrap_inline58840, any one operation is still O(n) in the worst case. This is so because we cannot say anything in general about the shape of the tree.

For example, consider the two binary search trees shown Figure gif. Both trees contain the same set of keys. The tree tex2html_wrap_inline62302 is obtained by starting with an empty tree and inserting the keys in the following order

displaymath63694

The tree tex2html_wrap_inline62304 is obtained by starting with an empty tree and inserting the keys in this order

displaymath63695

Clearly, tex2html_wrap_inline62304 is a better tree search tree than tex2html_wrap_inline62302. In fact, since tex2html_wrap_inline62304 is a perfect binary tree , its height is tex2html_wrap_inline63720. Therefore, all three operations, search, insertion, and withdrawal, have the same worst case asymptotic running time, tex2html_wrap_inline58840.

The reason that tex2html_wrap_inline62304 is better than tex2html_wrap_inline62302 is that it is the more balanced tree. If we could ensure that the search trees we construct are balanced then the worst-case running time of search, insertion, and withdrawal, could be made logarithmic rather than linear. But under what conditions is a tree balanced?

If we say that a binary tree is balanced if the left and right subtrees of every node have the same height, then the only trees which are balanced are the perfect binary trees. A perfect binary tree of height h has exactly tex2html_wrap_inline62690 internal nodes. Therefore, it is only possible to create perfect trees with n nodes for tex2html_wrap_inline63734. Clearly, this is an unsuitable balance condition because it is not possible to create a balanced tree for every n.

What are the characteristics of a good balance condition ?

  1. A good balance condition ensures that the height of a tree with n nodes is tex2html_wrap_inline58840.
  2. A good balance condition can be maintained efficiently. That is, the additional work necessary to balance the tree when an item is inserted or deleted is O(1).
Adelson-Velskii and Landisgif were the first to propose the following balance condition and show that it has the desired characteristics.

Definition (AVL Balance Condition)  An empty binary tree is AVL balanced  . A non-empty binary tree, tex2html_wrap_inline62666, is AVL balanced if both tex2html_wrap_inline62662 and tex2html_wrap_inline62664 are AVL balanced and

displaymath63696

where tex2html_wrap_inline63750 is the height of tex2html_wrap_inline62662 and tex2html_wrap_inline63754 is the height of tex2html_wrap_inline62664.

Clearly, all perfect binary trees are AVL balanced. What is not so clear is that heights of all trees that satisfy the AVL balance condition are logarithmic in the number of internal nodes.

Theorem  The height, h, of an AVL balanced tree with n internal nodes satisfies

displaymath63697

extbfProof The lower bound follows directly from Theorem gif. It is in fact true for all binary trees regardless of whether they are AVL balanced.

To determine the upper bound, we turn the problem around and ask the question, what is the minimum number of internal nodes in an AVL balanced tree of height h?

Let tex2html_wrap_inline63764 represent an AVL balanced tree of height h which has the smallest possible number of internal nodes, say tex2html_wrap_inline63768. Clearly, tex2html_wrap_inline63764 must have at least one subtree of height h-1 and that subtree must be tex2html_wrap_inline63774. To remain AVL balanced, the other subtree can have height h-1 or h-2. Since we want the smallest number of internal nodes, it must be tex2html_wrap_inline63780. Therefore, the number of internal nodes in tex2html_wrap_inline63764 is tex2html_wrap_inline63784, where tex2html_wrap_inline63786.

Clearly, tex2html_wrap_inline62516 contains a single internal node, so tex2html_wrap_inline63790. Similarly, tex2html_wrap_inline62282 contains exactly two nodes, so tex2html_wrap_inline63794. Thus, tex2html_wrap_inline63768 is given by the recurrence

  equation19120

The remarkable thing about Equation gif is its similarity with the definition of Fibonacci numbers  (Equation gif). In fact, it can easily be shown by induction that

displaymath63698

for all tex2html_wrap_inline62578, where tex2html_wrap_inline63800 is the tex2html_wrap_inline61000 Fibonacci number.

Base Cases

eqnarray19134

Inductive Hypothesis Assume that tex2html_wrap_inline63804 for tex2html_wrap_inline62588. Then

eqnarray19138

Therefore, by induction on k, tex2html_wrap_inline63804, for all tex2html_wrap_inline62578.

According to Theorem gif, the Fibonacci numbers are given by

displaymath61310

where tex2html_wrap_inline59432 and tex2html_wrap_inline59434. Furthermore, since tex2html_wrap_inline63818, tex2html_wrap_inline63820.

Therefore,

eqnarray19158

This completes the proof of the upper bound.

So, we have shown that the AVL balance condition satisfies the first criterion of a good balance condition--the height of an AVL balanced tree with n internal nodes is tex2html_wrap_inline63824. What remains to be shown is that the balance condition can be efficiently maintained. To see that it can, we need to look at an implementation.




next up previous contents index

Bruno Copyright © 1998 by Bruno R. Preiss, P.Eng. All rights reserved.