1. COUNTDOWN: A Run-Time Library for Performance-Neutral Energy Saving in MPI Applications.
- Author
- Cesarini, Daniele; Bartolini, Andrea; Bonfa, Pietro; Cavazzoni, Carlo; Benini, Luca
- Subjects
- *ENERGY consumption, *SYNCHRONIZATION, *ELECTRIC power conservation, *ESPRESSO
- Abstract
- Power and energy consumption are becoming key challenges in the race to exascale supercomputers. HPC systems' processors waste active power during communication and synchronization among the MPI processes in large-scale HPC applications. However, due to the time scale at which communication happens, transitioning into low-power states while waiting for the completion of each communication may introduce unacceptable overhead. In this article, we present COUNTDOWN, a run-time library for identifying and automatically reducing the power consumption of the CPUs during communication and synchronization. COUNTDOWN saves energy without penalizing the time-to-completion by lowering CPU power consumption only during idle times for which the power state transition overhead is negligible. This is done transparently to the user, requiring neither labor-intensive and error-prone application code modifications nor recompilation of the application. We test our methodology on a production Tier-1 system. For the NAS benchmarks, COUNTDOWN saves between 6 and 50 percent energy, with a time-to-solution penalty lower than 5 percent. In a complete production run of Quantum ESPRESSO on 3.5K cores, COUNTDOWN saves 22.36 percent energy, with a performance penalty below 3 percent. If the application is executed without communication tuning, the energy saving increases to 37 percent, with a performance penalty of 6.38 percent. [ABSTRACT FROM AUTHOR]
- Published
- 2021