Abstract
Maria Jesus Garzaran, Milos Prvulovic, Ye Zhang, Alin Jula, Hao Yu, Lawrence Rauchwerger, Josep Torrellas, "Architectural Support for Parallel Reductions in Scalable Shared-Memory Multiprocessors," In Proc. IEEE Int.Conf. on Parallel Architectures and Compilation Techniques (PACT), Barcelona, Spain, Sep 2001.
Proceedings(ps, pdf, abstract)
Reductions are important and time-consuming operations in many scientific codes. Effective parallelization of reductions is a critical transformation for loop parallelization, especially for sparse, dynamic applications. Unfortunately, conventional reduction parallelization algorithms are not scalable. In this paper, we present new architectural support that significantly speeds-up parallel reduction and makes it scalable in shared-memory multiprocessors. The required architectural changes are mostly confined to the directory controllers. Experimental results based on simulations show that the proposed support is very effective. While conventional software-only reduction parallelization delivers average speedups of only 2.7 for 16 processors, our scheme delivers average speedups of 7.6.