STAPL | Software & Systems | Parasol Laboratory
STAPL Parallel Algorithms (pAlgorithms)
STAPL pAlgorithm is the parallel counterpart for STL Algorithm. pAlgorithms operate with pContainers through pRanges.
The current provided pAlgorithms include 3 categories:
- Non-mutating algorithms
- p_for_each: applies a unary function object for each element in a pRange.
- p_count_if: finds the number of elements in a pRange that satisfy some predicate.
- Mutating algorithms
- p_generate: assigns the result of invoking the input function object that takes no arguments, to each element in a pRange.
- Sorting algorithms: STAPL currently provides four popular parallel algorithms for sorting. Each behaves differently when the architecture and input characteristics are varied, making it a good candidate for our adaptive algorithm selection framework in STAPL.
- Sample Sort: A two pass technique that first samples the data to create discrete buckets (one per processor), then the algorithm distributes the elements into these buckets. These buckets are then sorted independently.
- Radix Sort: implemented using parallel counting sort. The sort considers elements in chunks of r bits at a time from least to greatest. Processors count the number of times each value occurs in their local data and then the prefix sum of the counts is computed globally. Each processor inserts its elements appropriately into the global list. This approach is repeated over r bits at a time until all bits have been considered.
- Column Sort: a version of one of the first parallel sorts to prove a O(log n) upper bound on the parallel time. However, it requires four local sorts of data and four communications steps, two all-to-all and two one-to-one. Hence, although theoretically optimal, it is often outperformed by other parallel sorting algorithms. The STAPL implementation uses the pMatrix, one of the STAPL pContainers, as the base underlying data structure.
- Bitonic Sort: STAPL implements a bitonic merge sort. The algorithm first forms a bitonic sequence and then proceeds to finally sort the elements by merging the corresponding sequences.
- General Numeric Algorithms:
- p_accumulate: computes the sum of the input init value and all of the elements in a pRange.
- p_inner_product: calculates a generalized inner product of two pRanges. It can use default addition (operator+) and multiplication (operator*), or use two user-supplied function objects instead.
- First order linear Recurrence: computes user-supplied first-order linear recurrence on input pRange like: x[i] = x[i-1] + x[i], or more general form x[i] = a[i] * x[i-1] + x[i].
If you have questions about STAPL, please email stapl-support@tamu.edu