Towards Portable Online Prediction of Network Utilization using MPI-level Monitoring
- Source :
- EuroPar'19: 25th International European Conference on Parallel and Distributed Systems, Aug 2019, Goettingen, Germany, Lecture Notes in Computer Science ISBN: 9783030293994, Euro-Par
- Publication Year :
- 2019
- Publisher :
- HAL CCSD, 2019.
Abstract
- Stealing network bandwidth helps a variety of HPC runtimes and services run additional operations in the background without negatively affecting the applications. A key ingredient in making this possible is an accurate prediction of future network utilization, which lets the runtime plan background operations in advance so as to avoid competing with the application for network bandwidth. In this paper, we propose a portable deep learning predictor that uses only the information available through MPI introspection to construct a recurrent sequence-to-sequence neural network capable of forecasting network utilization. We leverage the fact that most HPC applications exhibit periodic behavior to enable predictions far into the future (at least the length of one period). Our online approach has no initial training phase; instead, it continuously improves itself during application execution without incurring significant computational overhead. Experimental results show better accuracy and lower computational overhead than the state of the art on two representative applications.
- Subjects :
- Artificial neural network
business.industry
Computer science
Network monitoring
Deep learning
Distributed computing
010103 numerical & computational mathematics
010501 environmental sciences
Prediction of resource utilization
01 natural sciences
Timeseries forecasting
Work stealing
Online learning
Leverage (statistics)
Artificial intelligence
0101 mathematics
[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
business
0105 earth and related environmental sciences
Details
- Language :
- English
- ISBN :
- 978-3-030-29399-4
- ISBNs :
- 9783030293994
- Database :
- OpenAIRE
- Journal :
- EuroPar'19: 25th International European Conference on Parallel and Distributed Systems, Aug 2019, Goettingen, Germany, Lecture Notes in Computer Science ISBN: 9783030293994, Euro-Par
- Accession number :
- edsair.doi.dedup.....4eb2b384668da872bc5721c29a62efbc