Back to Search
Start Over
Towards ultra-scale Branch-and-Bound using a high-productivity language
- Source :
- Future Generation Computer Systems, Future Generation Computer Systems, Elsevier, 2020, SI: On The Road to Exascale II: Advances on High Performance Computing and Simulations, 105, pp.196-209. ⟨10.1016/j.future.2019.11.011⟩, Future Generation Computer Systems, 2020, SI: On The Road to Exascale II: Advances on High Performance Computing and Simulations, 105, pp.196-209. ⟨10.1016/j.future.2019.11.011⟩
- Publication Year :
- 2020
- Publisher :
- HAL CCSD, 2020.
-
Abstract
- Due to the highly irregular nature and prohibitive execution times of Branch-and-Bound (B&B) algorithms applied to combinatorial optimization problems (COPs), their parallelization has received the se two last decades great attention. Indeed, significant efforts have been made to revisit the parallelization of B&B following the rapid evolution of high-performance computing technologies dealing with their associated scientific and technical challenges. However, these parallelization efforts have always been guided by the performance objective setting aside programming productivity. Nevertheless, this latter is crucial for designing ultra-scale algorithms able to harness modern supercomputers which are increasingly complex, including millions of processing cores and heterogeneous building-block devices. In this paper, we investigate the partitioned global address space (PGAS)-based approach using Chapel for the productivity-aware design and implementation of distributed B&B for solving large COPs. The proposed algorithms are intensively experimented using the Flow-shop scheduling problem as a test-case. The Chapel-based implementation is compared to its MPI+X-based traditionally used counterpart in terms of performance, scalability, and productivity. The results show that Chapel is much more expressive and up to 7 . 8 × more productive than MPI+Pthreads. In addition, the Chapel-based search presents performance equivalent to MPI+Pthreads for its best results on 1024 cores and reaches up to 84 % of the linear speedup. However, there are cases where the built-in load balancing provided by Chapel cannot produce regular load among computer nodes. In such cases, the MPI-based search can be up to 4 . 2 × faster and reaches speedups up to 3 × higher than its Chapel-based counterpart. Thorough feedback on the experience is given, pointing out the strengths and limitations of the two opposite approaches (Chapel vs. MPI+X). To the best of our knowledge, the present study is pioneering within the context of exact parallel optimization.
- Subjects :
- POSIX Threads
Speedup
Branch and bound
Job shop scheduling
Computer Networks and Communications
Computer science
PGAS
Branch-and-Bound
020206 networking & telecommunications
02 engineering and technology
Parallel computing
Load balancing (computing)
Hardware and Architecture
Ultra-scale optimization
Chapel
Scalability
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Partitioned global address space
MPI+X
[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
computer
Software
computer.programming_language
Subjects
Details
- Language :
- English
- ISSN :
- 0167739X
- Database :
- OpenAIRE
- Journal :
- Future Generation Computer Systems, Future Generation Computer Systems, Elsevier, 2020, SI: On The Road to Exascale II: Advances on High Performance Computing and Simulations, 105, pp.196-209. ⟨10.1016/j.future.2019.11.011⟩, Future Generation Computer Systems, 2020, SI: On The Road to Exascale II: Advances on High Performance Computing and Simulations, 105, pp.196-209. ⟨10.1016/j.future.2019.11.011⟩
- Accession number :
- edsair.doi.dedup.....9bcee97cdad7e1a3358d54e0852c9ac3
- Full Text :
- https://doi.org/10.1016/j.future.2019.11.011⟩