
Work-efficient nested data-parallelism

Authors :
Jan F. Prins
Daniel W. Palmer
S. Westfold
Source :
Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation.
Publication Year :
1995
Publisher :
IEEE Comput. Soc. Press, 1995.

Abstract

An apply-to-all construct is the key mechanism for expressing data-parallelism, but data-parallel programming languages like HPF and C* significantly restrict which operations can appear in the construct. Allowing arbitrary operations substantially simplifies the expression of irregular and nested data-parallel computations. The flattening technique introduced by Blelloch compiles data-parallel programs with unrestricted apply-to-all constructs into vector operations, and it has achieved notable success, particularly on irregular data-parallel programs. However, such programs must be carefully constructed so that flattening them does not lead to suboptimal work complexity due to unnecessary replication in index operations. We present new flattening transformations that generate programs with correct work complexity. Because these transformations may introduce concurrent reads in parallel indexing, we also developed a randomized indexing technique that reduces concurrent reads while maintaining work efficiency. Experimental results show that the new rules and implementations significantly reduce memory usage and improve performance.
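To make the abstract's terms concrete, the following is a minimal Python sketch of the general idea behind flattening: a nested sequence is stored as flat data plus per-subsequence lengths, and an apply-to-all over the nested structure becomes flat vector-style operations. The representation, the names (Nested, elementwise_add1, segmented_gather), and the use of Python are illustrative assumptions for this record only, not the paper's notation or its actual transformation rules.

```python
from itertools import accumulate

class Nested:
    """Nested sequence of numbers: flat data + per-subsequence lengths."""
    def __init__(self, data, seg_lengths):
        self.data = list(data)                # all elements, concatenated
        self.seg_lengths = list(seg_lengths)  # length of each subsequence

    @classmethod
    def from_lists(cls, lists):
        return cls([x for xs in lists for x in xs], [len(xs) for xs in lists])

    def to_lists(self):
        starts = [0, *accumulate(self.seg_lengths)]
        return [self.data[s:s + n] for s, n in zip(starts, self.seg_lengths)]

def elementwise_add1(nested):
    """Apply-to-all {x + 1 : x in xs} over every subsequence, flattened into
    one pass over the flat data with no per-subsequence loop nesting."""
    return Nested([x + 1 for x in nested.data], nested.seg_lengths)

def segmented_gather(values, nested_indices):
    """Work-efficient indexing: every subsequence of indices selects from the
    same shared `values`, instead of replicating `values` per subsequence."""
    return Nested([values[i] for i in nested_indices.data],
                  nested_indices.seg_lengths)

if __name__ == "__main__":
    xs = Nested.from_lists([[1, 2, 3], [], [4, 5]])
    print(elementwise_add1(xs).to_lists())          # [[2, 3, 4], [], [5, 6]]

    table = [10, 20, 30, 40]
    idx = Nested.from_lists([[0, 3], [2], [1, 1, 0]])
    print(segmented_gather(table, idx).to_lists())  # [[10, 40], [30], [20, 20, 10]]
```

In this sketch, segmented_gather performs work proportional to the number of indices but reads the shared values array from many positions at once; that is the concurrent-read issue the abstract's randomized indexing technique addresses, whereas replicating values once per subsequence would avoid concurrent reads at the cost of the inflated work and memory the new flattening rules are designed to eliminate.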

Details

Database :
OpenAIRE
Journal :
Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation
Accession number :
edsair.doi...........dac5f306ae893b5439a3c947ee8cd9e9
Full Text :
https://doi.org/10.1109/fmpc.1995.380449