In a distributed memory multiprocessing system, the manner in which data and computations are distributed among the processors has a significant impact on the resultant performance. On account of the varying data reference patterns at different points of a program it may be more profitable to change the data distributions over the course of the program to adapt to these variations. This thesis proposes a framework to automatically select data distributions for different points of a program in the presence of procedure calls, at compile-time. Since modifying a distribution from one point to another involves interprocessor communication overheads called redistributionoverheads, it is critical to choose new distributions so that the resultant redistribution overheads are sufficiently offset by the computational gains contributed by the new distributions in subsequent phases of computation.
In our framework, each procedure in the program is first analyzed to identify points at which data distributions may be required to be modified. These points are then associated with a set of distributions that favour its computations. These distributions form the distribution search space for that point. Each distribution is assigned a relative estimate of its ability to parallelize the computations for which it was chosen. Interprocedural analysis techniques are used to propagate the data distribution information across procedures. A dynamic programming strategy is then used to select distributions from these search spaces for different points along the execution path of the program. The selection is done so that the resultant computational gains in the presence of redistribution losses is maximum. The final selected distributions determine the data redistributions required along each execution path in the program.
Experimental results obtained by testing the dynamic programming algorithm on some standard HPF programs indicate that it is possible to determine efficient dynamic data distributions for a program in the presence of procedure boundaries, at compile-time.