There is a growing need to accurately simulate physical systems whose evolution depends on the transport of subatomic particles. It has long been recognized that the huge computational demands of the transport problem mean that practical solution times will be obtained only by the efficient utilization of parallel processing. For example, since estimates place the time devoted to particle transport in multi-physics simulations at 50-80% of total execution time, parallelizing deterministic particle transport calculations is an important problem in many applications targeted by the DOE's Accelerated Strategic Computing Initiative (ASCI). One common approach to deterministic particle transport calculations is the discrete-ordinates method, whose most time consuming step is the transport sweep which involves multiple sweeps through the spatial grid, one for each direction of particle travel. The efficient parallel implementation of the transport sweeps is the key to parallelizing the discrete-ordinates method.
The key contribution of this paper
is the first general model which can be used to predict the running time
of transport sweeps on orthogonal grids for any regular mapping of the
grid cells to processors. Our model, which accounts for machine
dependent parameters such as computation cost and communication latency,
can be used to analyze and compare the effects of various spatial
decompositions on the running time of the transport sweep. Insight
obtained from the model yields two significant contributions to the
theory of optimal transport sweeps on orthogonal grids. First, our model
provides a theoretical basis which explains why, and under what
circumstances, the column decomposition of the current standard KBA
algorithm is superior to the 'balanced' decomposition obtained by
classic domain decomposition techniques. Second, our model enables us to
identify a new decomposition, we call Hybrid, which proves to be almost
as good as, and sometimes superior to, the current standard KBA method.
Our analysis covers sweeps in two- and three-dimensional spatial domains,
and first considers sweeps in only one direction, and then sweeps
involving multiple simultaneous directions. We obtain expressions for
the completion time and discuss theoretical results.