Efficient Parallel Sort on AVX-512-Based Multi-Core and Many-Core Architectures
- Source :
- HPCC/SmartCity/DSS
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
Abstract
- Sorting kernels are a fundamental part of numerous applications. The performance of sorting implementations is usually limited by a variety of factors such as computing power, memory bandwidth, and branch mispredictions. In this paper we propose an efficient hybrid sorting method which takes advantage of the wide vector registers and the high-bandwidth memory of modern AVX-512-based multi-core and many-core processors. Our approach employs a combination of vectorized bitonic sorting and load-balanced multi-threaded merging. Thread-level and data-level parallelism are used to exploit both compute power and memory bandwidth. Our single-threaded implementation is ~30x faster than qsort in the C standard library and ~10x faster than C++'s std::sort. Compared with the Intel Performance Primitives (IPP) library, which is one of the most efficient CPU-based radix sort implementations, we obtain a speedup of 1.3 to 2.6. Furthermore, we achieve a peak performance of sorting 1.14 billion floats per second on a Xeon Phi 7210 processor. Moreover, we show the extensibility of our vectorized kernels to processing units with a varying number of vector lanes.
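The abstract's combination of bitonic sorting and merging can be illustrated with a minimal scalar sketch. This is not the authors' AVX-512 implementation: the names `bitonic_sort` and `hybrid_sort` and the `block` parameter are hypothetical, the compare-exchange loop runs one pair at a time where a vectorized kernel would process many lanes per instruction, and a sequential `heapq.merge` stands in for the paper's load-balanced multi-threaded merge.

```python
import heapq


def bitonic_sort(values):
    """Sort a power-of-two-length sequence with a bitonic network (scalar sketch)."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Data-independent compare-exchange: the pairs compared
                    # never depend on the values, which is what makes the
                    # network suitable for SIMD lanes.
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


def hybrid_sort(values, block=8):
    """Sort fixed-size blocks with the network, then merge the sorted blocks.

    Assumption: `block` is a power of two. The last block is padded with
    +inf so that it fits the network, and the padding is dropped afterwards.
    """
    blocks = []
    for start in range(0, len(values), block):
        chunk = list(values[start:start + block])
        pad = block - len(chunk)
        sorted_chunk = bitonic_sort(chunk + [float("inf")] * pad)
        blocks.append(sorted_chunk[:len(chunk)])
    # Sequential k-way merge; a stand-in for a parallel merge stage.
    return list(heapq.merge(*blocks))
```

For example, `hybrid_sort(data, block=4)` sorts each group of four elements with the network and then merges the sorted groups into one ordered list.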
- Subjects :
- 020203 distributed computing
Bitonic sorter
Speedup
Computer science
Radix sort
Sorting
Memory bandwidth
02 engineering and technology
Parallel computing
Bitonic sorting
020202 computer hardware & architecture
0202 electrical engineering, electronic engineering, information engineering
sort
qsort
Merge sort
Branch misprediction
Xeon Phi
Details
- Database :
- OpenAIRE
- Journal :
- 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
- Accession number :
- edsair.doi...........1742c45ff537a0f8de63b67036874804