Logo Data Structures and Algorithms with Object-Oriented Design Patterns in C++
next up previous contents index

Average Case Analysis

The previous section has shown that in the worst case, the running time to insert an object into a separately chained hash table is O(1), and the time to find or delete an object is O(n). But these bounds are no better than the same operations on plain lists! Why have we gone to all the trouble inventing hash tables?

The answer lies not in the worst-case performance, but in the average expected performance. Suppose we have a hash table of size M. Let there be exactly n items in the hash table. We call the quantity tex2html_wrap_inline62864 the load factor   . The load factor is simply the ratio of the number of items in the hash table to the array length.

Let tex2html_wrap_inline62818 be the number of items in the tex2html_wrap_inline58387 linked list, for tex2html_wrap_inline62822. The average length of a linked list is

eqnarray12006

The average length of a linked list is exactly the load factor!

If we are given the load factor tex2html_wrap_inline62866, we can determine the average running times for the various operations. The average running time of Insert is the same as its worst case time, O(1)--this result does not depend on tex2html_wrap_inline62866. On the other hand, the average running time for Withdraw does depend on tex2html_wrap_inline62866. It is tex2html_wrap_inline62882 since the time required to delete an item from a linked list of length tex2html_wrap_inline62866 is tex2html_wrap_inline62886.

To determine the average running time for the Find operation, we need to make an assumption about whether the item that is being sought is in the table. If the item is not found in the table, the search is said to be unsuccessful. The average running time for an unsuccessful search is

displaymath62850

On the other hand, if the search target is in the table, the search is said to be successful. The average number of comparisons needed to find an arbitrary item in a linked list of length tex2html_wrap_inline62866 is

displaymath62851

Thus, the average running time for a successful search is

displaymath62852

So, while any one search operation can be as bad as O(n), if we do a large number of random searches, we expect that the average running time will be tex2html_wrap_inline62886. In fact, if we have a sufficiently good hash function and a reasonable set of objects in the container, we can expect that those objects are distributed throughout the table. Therefore, any one search operation will not be very much worse than the worst case.

Finally, if we know how many objects will be inserted into the hash table a priori, then we can choose a table size M which is larger than the maximum number of items expected. By doing this, we can ensure that tex2html_wrap_inline62896. I.e., a linked list contains no more than one item on average. In this case, the average time for Withdraw is tex2html_wrap_inline62834 and for Find it is tex2html_wrap_inline62900.


next up previous contents index

Bruno Copyright © 1997 by Bruno R. Preiss, P.Eng. All rights reserved.