
Towards Portable Online Prediction of Network Utilization using MPI-level Monitoring

Authors :
Emmanuel Jeannot
Franck Cappello
Shu Mei Tseng
Bogdan Nicolae
Aparna Chandramowlishwaran
George Bosilca
University of California [Irvine] (UCI)
University of California
Argonne National Laboratory [Lemont] (ANL)
The University of Tennessee [Knoxville]
Topology-Aware System-Scale Data Management for High-Performance Computing (TADAAM)
Laboratoire Bordelais de Recherche en Informatique (LaBRI)
Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Inria Bordeaux - Sud-Ouest
Institut National de Recherche en Informatique et en Automatique (Inria)
This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This material was based upon work supported by the U.S. Department of Energy, Office of Science, under contract DE-AC02-06CH11357, and by the National Science Foundation under Grant No. 1664142. The experiments presented in this paper were carried out using the Grid'5000/ALADDIN-G5K experimental testbed, an initiative of the French Ministry of Research through the ACI GRID incentive action, INRIA, CNRS and RENATER and other contributing partners (see http://www.grid5000.fr/).
Source :
EuroPar'19: 25th International European Conference on Parallel and Distributed Systems, Aug 2019, Goettingen, Germany, Lecture Notes in Computer Science ISBN: 9783030293994, Euro-Par
Publication Year :
2019
Publisher :
HAL CCSD, 2019.

Abstract

Stealing network bandwidth helps a variety of HPC runtimes and services run additional operations in the background without negatively affecting applications. A key ingredient to make this possible is an accurate prediction of future network utilization, which enables the runtime to plan background operations in advance so as to avoid competing with the application for network bandwidth. In this paper, we propose a portable deep learning predictor that uses only the information available through MPI introspection to construct a recurrent sequence-to-sequence neural network capable of forecasting network utilization. We leverage the fact that most HPC applications exhibit periodic behaviors to enable predictions far into the future (at least the length of a period). Our online approach has no initial training phase; it continuously improves itself during application execution without incurring significant computational overhead. Experimental results show better accuracy and lower computational overhead compared with the state of the art on two representative applications.

Details

Language :
English
ISBN :
978-3-030-29399-4
ISBNs :
9783030293994
Database :
OpenAIRE
Journal :
EuroPar'19: 25th International European Conference on Parallel and Distributed Systems, Aug 2019, Goettingen, Germany, Lecture Notes in Computer Science ISBN: 9783030293994, Euro-Par
Accession number :
edsair.doi.dedup.....4eb2b384668da872bc5721c29a62efbc