Data Structures and Algorithms with Object-Oriented Design Patterns in C#
next up previous contents index

Middle Square Method

In this section we consider a hashing method which avoids the use of division. Since integer division is usually slower than integer multiplication, by avoiding division we can potentially improve the running time of the hashing algorithm. We can avoid division by making use of the fact that a computer does finite-precision integer arithmetic. For example, all arithmetic is done modulo W where tex2html_wrap_inline61457 is a power of two such that w is the word size  of the computer.

The middle-square hashing method   works as follows. First, we assume that M is a power of two, say tex2html_wrap_inline61423 for some tex2html_wrap_inline58801. Then, to hash an integer x, we use the following hash function:

displaymath61449

Notice that since M and W are both powers of two, the ratio tex2html_wrap_inline61473 is also a power two. Therefore, in order to multiply the term tex2html_wrap_inline61475 by M/W we simply shift it to the right by w-k bits! In effect, we are extracting k bits from the middle of the square of the key--hence the name of the method.

The following code fragment illustrates the middle-square method of hashing:

public class MiddleSquareMethod
{
    private const int k = 10; // M==1024
    private const int w = 32;

    public static int H(int x)
	{ return (int)((uint)(x * x)) >> (w - k); }
}
Since x is an int, the product x * x is also an an int. In C#, an int represents a 32-bit quantity and the product of two ints is also a 32-bit quantity. The final result is obtained by shifting the product w-k bits to the right, where w is the number of bits in an integer. Note, we cast the product to a uint before the shift so as to cause the right-shift operator to insert zeroes on the left. Therefore, the result always falls between 0 and M-1.

The middle-square method does a pretty good job when the integer-valued keys are equiprobable. The middle-square method also has the characteristic that it scatters consecutive keys nicely. However, since the middle-square method only considers a subset of the bits in the middle of tex2html_wrap_inline61491, keys which have a large number of leading zeroes will collide. For example, consider the following set of keys:

displaymath61450

This set contains all keys x such that tex2html_wrap_inline61495. For all of these keys h(x)=0.

A similar line of reasoning applies for keys which have a large number of trailing zeroes. Let W be an even power of two. Consider the set of keys

displaymath61451

The least significant w/2 bits of the keys in this set are all zero. Therefore, the least significant w bits of of tex2html_wrap_inline61491 are also zero and as a result h(x)=0 for all such keys!


next up previous contents index

Bruno Copyright © 2001 by Bruno R. Preiss, P.Eng. All rights reserved.