Data Structures and Algorithms with Object-Oriented Design Patterns in C#
next up previous contents index

AVL Search Trees

 

The problem with binary search trees is that while the average running times for search, insertion, and withdrawal operations are all tex2html_wrap_inline59121, any one operation is still O(n) in the worst case. This is so because we cannot say anything in general about the shape of the tree.

For example, consider the two binary search trees shown Figure gif. Both trees contain the same set of keys. The tree tex2html_wrap_inline62569 is obtained by starting with an empty tree and inserting the keys in the following order

displaymath63961

The tree tex2html_wrap_inline62571 is obtained by starting with an empty tree and inserting the keys in this order

displaymath63962

Clearly, tex2html_wrap_inline62571 is a better tree search tree than tex2html_wrap_inline62569. In fact, since tex2html_wrap_inline62571 is a perfect binary tree , its height is tex2html_wrap_inline63987. Therefore, all three operations, search, insertion, and withdrawal, have the same worst case asymptotic running time, tex2html_wrap_inline59121.

The reason that tex2html_wrap_inline62571 is better than tex2html_wrap_inline62569 is that it is the more balanced tree. If we could ensure that the search trees we construct are balanced then the worst-case running time of search, insertion, and withdrawal, could be made logarithmic rather than linear. But under what conditions is a tree balanced?

If we say that a binary tree is balanced if the left and right subtrees of every node have the same height, then the only trees which are balanced are the perfect binary trees. A perfect binary tree of height h has exactly tex2html_wrap_inline62957 internal nodes. Therefore, it is only possible to create perfect trees with n nodes for tex2html_wrap_inline64001. Clearly, this is an unsuitable balance condition because it is not possible to create a balanced tree for every n.

What are the characteristics of a good balance condition ?

  1. A good balance condition ensures that the height of a tree with n nodes is tex2html_wrap_inline59121.
  2. A good balance condition can be maintained efficiently. That is, the additional work necessary to balance the tree when an item is inserted or deleted is O(1).
Adelson-Velskii and Landisgif were the first to propose the following balance condition and show that it has the desired characteristics.

Definition (AVL Balance Condition)  An empty binary tree is AVL balanced  . A non-empty binary tree, tex2html_wrap_inline62933, is AVL balanced if both tex2html_wrap_inline62929 and tex2html_wrap_inline62931 are AVL balanced and

displaymath63963

where tex2html_wrap_inline64017 is the height of tex2html_wrap_inline62929 and tex2html_wrap_inline64021 is the height of tex2html_wrap_inline62931.

Clearly, all perfect binary trees are AVL balanced. What is not so clear is that heights of all trees that satisfy the AVL balance condition are logarithmic in the number of internal nodes.

Theorem  The height, h, of an AVL balanced tree with n internal nodes satisfies

displaymath63964

extbfProof The lower bound follows directly from Theorem gif. It is in fact true for all binary trees regardless of whether they are AVL balanced.

To determine the upper bound, we turn the problem around and ask the question, what is the minimum number of internal nodes in an AVL balanced tree of height h?

Let tex2html_wrap_inline64031 represent an AVL balanced tree of height h which has the smallest possible number of internal nodes, say tex2html_wrap_inline64035. Clearly, tex2html_wrap_inline64031 must have at least one subtree of height h-1 and that subtree must be tex2html_wrap_inline64041. To remain AVL balanced, the other subtree can have height h-1 or h-2. Since we want the smallest number of internal nodes, it must be tex2html_wrap_inline64047. Therefore, the number of internal nodes in tex2html_wrap_inline64031 is tex2html_wrap_inline64051, where tex2html_wrap_inline64053.

Clearly, tex2html_wrap_inline62783 contains a single internal node, so tex2html_wrap_inline64057. Similarly, tex2html_wrap_inline62549 contains exactly two nodes, so tex2html_wrap_inline64061. Thus, tex2html_wrap_inline64035 is given by the recurrence

  equation19244

The remarkable thing about Equation gif is its similarity with the definition of Fibonacci numbers  (Equation gif). In fact, it can easily be shown by induction that

displaymath63965

for all tex2html_wrap_inline62845, where tex2html_wrap_inline64067 is the tex2html_wrap_inline61267 Fibonacci number.

Base Cases

eqnarray19258

Inductive Hypothesis Assume that tex2html_wrap_inline64071 for tex2html_wrap_inline62855. Then

eqnarray19262

Therefore, by induction on k, tex2html_wrap_inline64071, for all tex2html_wrap_inline62845.

According to Theorem gif, the Fibonacci numbers are given by

displaymath61577

where tex2html_wrap_inline59713 and tex2html_wrap_inline59715. Furthermore, since tex2html_wrap_inline64085, tex2html_wrap_inline64087.

Therefore,

eqnarray19282

This completes the proof of the upper bound.

So, we have shown that the AVL balance condition satisfies the first criterion of a good balance condition--the height of an AVL balanced tree with n internal nodes is tex2html_wrap_inline64091. What remains to be shown is that the balance condition can be efficiently maintained. To see that it can, we need to look at an implementation.




next up previous contents index

Bruno Copyright © 2001 by Bruno R. Preiss, P.Eng. All rights reserved.