Logo Data Structures and Algorithms with Object-Oriented Design Patterns in C++
next up previous contents index

Linear Probing

The simplest collision resolution strategy in open addressing is called linear probing . In linear probing, the function c(i) is a linear function in i. I.e., it is of the form

displaymath63020

Property 1 requires that c(0)=0. Therefore, tex2html_wrap_inline63030 must be zero.

In order for tex2html_wrap_inline63032 to satisfy Property 2, tex2html_wrap_inline63034 and M must be relatively prime. If we know the M will always be a prime number, then any tex2html_wrap_inline63034 will do. On the other hand, if we cannot be certain that M is prime, then tex2html_wrap_inline63034 must be one. Therefore, linear probing sequence that is usually used is

displaymath63021

for tex2html_wrap_inline63046.

Figure gif illustrates an example of a scatter table using open addressing together with linear probing. For example, consider the string "åtta". This string hashes to array position tex2html_wrap_inline62904. The corresponding linear probing sequence begins at position tex2html_wrap_inline62904 and goes on to positions tex2html_wrap_inline62908, tex2html_wrap_inline62910,.... In this case, the search for the string "åtta" succeeds after three probes.

   figure13223
Figure: Scatter Table using Open Addressing and Linear Probing

To insert an item x into the scatter table, an empty cell is found by following the same probe sequence that would be used in a search for item x. Thus, linear probing finds an empty cell by doing a linear search beginning from array position h(x).

An unfortunate characteristic of linear probing arises from the fact that as the table fills, clusters of consecutive cells form and the time required for a search increases with the size of the cluster. Furthermore, when we attempt to insert an item in the table at a position which is already occupied, that item is ultimately inserted at the end of the cluster--thereby increasing its length. This by itself is not inherently a bad thing. After all, when using the chained approach, every insertion increase the length of some chain by one. However, whenever an insertion is made between two clusters that are separated by one unoccupied position, the two clusters become one, thereby potentially increasing the cluster length by an amount much greater than one--a bad thing! This phenomenon is called primary clustering .


next up previous contents index

Bruno Copyright © 1997 by Bruno R. Preiss, P.Eng. All rights reserved.