Recall from Section 3.7.2 that a hypercube connects
each of  P
 tasks ( P
 a power of 2) to 
 other tasks
(Figure 3.16).  The template considered in this
chapter uses this communication structure in an SPMD fashion, with
each task executing Algorithm 11.1.  A local  state
variable is first set to be the supplied input data.  Computation then
proceeds in 
 steps.  In each step, each task first exchanges
its local  state with one of its neighbors in the hypercube and
then combines the  message received from the neighbor with 
state to generate a new  state.  The output of the computation
is the  state generated in the final step.
In Algorithm 11.1, the  XOR function denotes an
exclusive or operation and is used to identify neighbors.  (Exclusive
or is defined as follows:  0 XOR 0=0,  0 XOR 1=1,  1 XOR
0=1,  1 XOR 1=0.)  As noted in Section 3.7.2, the
hypercube has the property that the binary labels of two nodes that
are neighbors in the
 d
th dimension differ only in the  d
th place; hence, the
expression  myid XOR 
 yields the  i
th neighbor of node
 myid.
A particular parallel algorithm is defined by the operator OP used to combine state and message at each step in the template. In the following, we shall show how this template can be used as a basis for parallel vector reduction, matrix transposition, and sorting algorithms.
 
    
 
 
 
 
 
   © Copyright 1995 by Ian Foster