Array Subscript Calculations

Data Structures and Algorithms with Object-Oriented Design Patterns in C++

Array Subscript Calculations

The memory of a computer is essentially a one-dimensional array--the memory address is the array subscript. Therefore, the most natural way to implement a multi-dimensional array is to store its elements in a one-dimensional array. In order to do this, we need a mapping from the n subscript expressions used to access an element of the multi-dimensional array to the one subscript expression used to access the one-dimensional array. E.g., suppose we have a two-dimensional array of elements of type T, T a[2][3], the elements of which are to be stored in a one-dimensional array, T b[20]. Then we need to determine which element of b, say b[k], will be accessed given a reference of the form a[i][j]. I.e., we need the mapping f such that .

The mapping function determines the way in which the elements of the array are stored in memory. The most common way to represent an array is in row-major order , also known as lexicographic order . E.g., consider the 2D array T a[2][3]. The row-major layout of this array is shown in Figure .

Figure: Row-Major Order Layout of a 2D Array

In row-major layout, it is the right-most subscript expression (the column index) that increases the fastest. As a result, the elements of the rows of the matrix end up stored in contiguous memory locations. In Figure , the array is stored starting from address a. The first element of the first row is a address . The first element of the second row is at address , since there are 3 elements in each row.

We can now generalize this to an arbitrary n-dimensional array. Suppose we have an n-D array declared as T a[ ][ ][ ][ ]. Furthermore, let a be the starting address of the array. Then, the address of the element a[ ][ ][ ][ ] is given by

where

The running time required to calculate the address appears to be since the address is the sum of n terms and for each term we need to compute , which requires O(n) multiplications in the worst case. However, the address calculation can in fact be done in O(n) time using the following algorithm:

unsigned int product = 1;
T* address =  ;
for (int j = n; j >= 1; -j)

    address += product *  ;
    product *=  ;

This algorithm makes subtle use of the way that address arithmetic is done in C++. Since the variable address is of type T*, it is not necessary to scale the computation by sizeof(T). In C++ whenever an integer value is added to a pointer variable, it is automatically scaled by the compiler before the addition.