Mapping and scheduling HPC applications for optimizing I/O
- Source :
- ICS2020-34th ACM International Conference on Supercomputing, Jun 2020, Barcelona, Spain, e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid, instname, ICS
- Publication Year :
- 2020
- Publisher :
- Association for Computing Machinery (ACM), 2020.
Abstract
- In HPC platforms, concurrent applications share the same file system. This can lead to conflicts, especially as applications become more and more data-intensive, and the resulting I/O contention can become a performance bottleneck. Access to bandwidth can be split into two complementary yet distinct problems: the mapping problem and the scheduling problem. The mapping problem consists in selecting the set of applications that compete for the I/O resource. The scheduling problem then consists, given I/O requests on the same resource, in determining the order of these accesses so as to minimize the I/O time. In this work we propose to couple a novel bandwidth-aware mapping algorithm with I/O list-scheduling policies to develop a cross-layer optimization solution. We study this solution experimentally using an I/O middleware, CLARISSE. We show that naive policies such as FIFO perform relatively well for scheduling I/O movements, and that reducing congestion depends mostly on the mapping. We evaluate the proposed algorithm using a simulator that we validated experimentally. This evaluation shows significant gains for the simple, bandwidth-aware mapping solution that we provide compared to its non-bandwidth-aware counterpart. The gains are both in terms of machine efficiency (makespan) and application efficiency (stretch). This further underscores the importance of designing efficient, bandwidth-aware mapping strategies to alleviate the cost of I/O congestion. This work was supported in part by the French National Research Agency (ANR) in the framework of DASH (ANR-17-CE25-0004). Some of the experiments presented in this paper were carried out using the PlaFRIM experimental testbed, supported by Inria, CNRS (LABRI and IMB), Université de Bordeaux, Bordeaux INP and Conseil Régional d’Aquitaine (see https://www.plafrim.fr/).
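The two metrics named in the abstract, makespan and stretch, can be illustrated with a toy FIFO I/O schedule. The sketch below is not the paper's algorithm, simulator, or the CLARISSE API. It assumes a hypothetical model in which each application computes for a fixed time, then issues a single I/O request that is served exclusively at the full file-system bandwidth, in arrival order. The names `App`, `fifo_schedule`, `makespan`, and `stretches` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    compute_time: float  # seconds of computation before the I/O phase (assumed model)
    io_volume: float     # GB transferred in the single I/O phase (assumed model)


def fifo_schedule(apps, bandwidth):
    """Serve I/O requests one at a time, in arrival order, at full bandwidth.

    Returns a dict mapping each application name to its completion time.
    """
    finish = {}
    free_at = 0.0  # time at which the shared file system becomes free
    # An I/O request arrives when the application's compute phase ends.
    for app in sorted(apps, key=lambda a: a.compute_time):
        start = max(app.compute_time, free_at)
        free_at = start + app.io_volume / bandwidth
        finish[app.name] = free_at
    return finish


def makespan(finish):
    """Machine-level metric: time at which the last application completes."""
    return max(finish.values())


def stretches(apps, finish, bandwidth):
    """Application-level metric: completion time divided by the time the
    application would need if it ran alone on the platform."""
    return {
        a.name: finish[a.name] / (a.compute_time + a.io_volume / bandwidth)
        for a in apps
    }


# Two applications contending for a 10 GB/s file system.
apps = [App("A", compute_time=10.0, io_volume=40.0),
        App("B", compute_time=12.0, io_volume=10.0)]
finish = fifo_schedule(apps, bandwidth=10.0)
total = makespan(finish)
stretch = stretches(apps, finish, bandwidth=10.0)
```

In this toy instance, application B's request arrives while A still holds the file system, so B waits and its stretch exceeds 1 while A's stays at 1. A mapping decision that avoided co-locating the two on the same I/O resource would remove that wait, which is the kind of effect the abstract attributes to the mapping step.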
- Subjects :
- File system
Informatics
020203 distributed computing
Job shop scheduling
I/O scheduling
I/O contention
Computer science
cross-layer optimizations
Distributed computing
[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]
020206 networking & telecommunications
02 engineering and technology
Bottleneck
Scheduling (computing)
Mapping algorithm
0202 electrical engineering, electronic engineering, information engineering
Mechanical efficiency
MPI
[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
Schedule I
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- ICS2020-34th ACM International Conference on Supercomputing, Jun 2020, Barcelona, Spain, e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid, instname, ICS
- Accession number :
- edsair.doi.dedup.....6d10f5ad012e6097c4b0d9506bca44d5