
Mapping and scheduling HPC applications for optimizing I/O

Authors :
Jesus Carretero
Emmanuel Jeannot
Nicolas Vidal
Guillaume Pallez
David E. Singh
Universidad Carlos III de Madrid [Madrid] (UC3M)
Topology-Aware System-Scale Data Management for High-Performance Computing (TADAAM)
Laboratoire Bordelais de Recherche en Informatique (LaBRI)
Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Inria Bordeaux - Sud-Ouest
Institut National de Recherche en Informatique et en Automatique (Inria)
ANR-17-CE25-0004,DASH,Ordonnancement de données pour le calcul haute-performance(2017)
Source :
ICS2020-34th ACM International Conference on Supercomputing, Jun 2020, Barcelona, Spain, e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid, instname, ICS
Publication Year :
2020
Publisher :
Association for Computing Machinery (ACM), 2020.

Abstract

On HPC platforms, concurrent applications share the same file system. This can lead to conflicts, especially as applications become more and more data intensive, and I/O contention can become a performance bottleneck. Access to bandwidth can be split into two complementary yet distinct problems: the mapping problem and the scheduling problem. The mapping problem consists in selecting the set of applications that compete for the I/O resource. The scheduling problem then consists, given I/O requests on the same resource, in determining the order of these accesses so as to minimize the I/O time. In this work we propose to couple a novel bandwidth-aware mapping algorithm with I/O list-scheduling policies to develop a cross-layer optimization solution. We study this solution experimentally using an I/O middleware, CLARISSE. We show that naive policies such as FIFO perform relatively well for scheduling I/O movements, and that reducing congestion depends mostly on the mapping. We evaluate the proposed algorithm using a simulator that we validated experimentally. This evaluation shows important gains for the simple, bandwidth-aware mapping solution that we provide compared to its non-bandwidth-aware counterpart. The gains are in terms of both machine efficiency (makespan) and application efficiency (stretch). This further stresses the importance of designing efficient, bandwidth-aware mapping strategies to alleviate the cost of I/O congestion.

This work was supported in part by the French National Research Agency (ANR) within the framework of DASH (ANR-17-CE25-0004). Some of the experiments presented in this paper were carried out using the PlaFRIM experimental testbed, supported by Inria, CNRS (LaBRI and IMB), Université de Bordeaux, Bordeaux INP and Conseil Régional d’Aquitaine (see https://www.plafrim.fr/).
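
The two metrics named in the abstract, makespan and stretch, can be illustrated with a toy model. The sketch below is not the paper's algorithm, simulator, or CLARISSE: the `App` record, the single shared I/O resource, the one-compute-phase-then-one-I/O-phase job shape, and the definition of stretch as completion time divided by the time the application would take alone are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical application: one compute phase, then one I/O phase."""
    name: str
    compute: float  # seconds of computation before the I/O phase
    volume: float   # GB transferred during the I/O phase


def fifo_schedule(apps, bandwidth):
    """Serve I/O phases one at a time, in order of request arrival (FIFO).

    Assumes all apps start at t=0 and that an app issues its I/O request
    when its computation ends. Returns {name: completion_time}.
    """
    free_at = 0.0  # time at which the shared I/O resource becomes free
    done = {}
    for app in sorted(apps, key=lambda a: a.compute):  # arrival order
        start = max(free_at, app.compute)  # wait if the resource is busy
        free_at = start + app.volume / bandwidth
        done[app.name] = free_at
    return done


def makespan(done):
    """Machine-level metric: completion time of the last application."""
    return max(done.values())


def stretch(app, done, bandwidth):
    """Application-level metric: slowdown relative to running alone."""
    alone = app.compute + app.volume / bandwidth
    return done[app.name] / alone


apps = [App("A", compute=10.0, volume=20.0), App("B", compute=12.0, volume=10.0)]
bandwidth = 5.0  # GB/s, assumed
done = fifo_schedule(apps, bandwidth)
print(makespan(done))
for app in apps:
    print(app.name, stretch(app, done, bandwidth))
```

In this example application B requests I/O while A is still writing, so B waits and its stretch exceeds 1. A mapping decision that placed A and B on different I/O resources would remove that wait, which is the intuition behind the abstract's claim that mapping matters more than the scheduling order.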

Details

Language :
English
Database :
OpenAIRE
Journal :
ICS2020-34th ACM International Conference on Supercomputing, Jun 2020, Barcelona, Spain, e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid, instname, ICS
Accession number :
edsair.doi.dedup.....6d10f5ad012e6097c4b0d9506bca44d5