1. Learning Intermediate Representations using Graph Neural Networks for NUMA and Prefetchers Optimization
- Authors
- TehraniJamsaz, Ali; Popov, Mihail; Dutta, Akash; Saillard, Emmanuelle; Jannesari, Ali
- Affiliations
- Iowa State University (ISU); STatic Optimizations, Runtime Methods (STORM); Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS); Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)
- Acknowledgments
- We thank the Research IT team of Iowa State University for their continuous support in providing access to HPC clusters for conducting the experiments of this research project. Experiments presented in this paper were carried out using the experimental testbed PlaFRIM, supported by Inria, CNRS (LaBRI and IMB), Université de Bordeaux, Bordeaux INP, and the Conseil Régional d'Aquitaine, and Grid'5000, supported by a scientific interest group hosted by Inria, CNRS, RENATER, and several universities as well as other organizations.
- Subjects
- FOS: Computer and information sciences; Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); [INFO]Computer Science [cs]; NUMA; graph neural networks; OpenMP; LLVM Intermediate Representation; prefetching
- Abstract
There is a large space of NUMA and hardware prefetcher configurations that can significantly impact application performance. Previous studies have demonstrated that a model can automatically select configurations based on the dynamic properties of the code to achieve speedups. This paper demonstrates how the static Intermediate Representation (IR) of the code can guide NUMA/prefetcher optimizations without the prohibitive cost of performance profiling. We propose a method to construct a comprehensive dataset that pairs a diverse set of intermediate representations with their optimal configurations, and we apply a graph neural network model to validate this dataset. We show that our static IR-based model achieves 80% of the performance gains provided by expensive dynamic, profiling-based strategies. We further develop a hybrid model that uses both static and dynamic information; it matches the gains of the dynamic models at a reduced cost, profiling only 30% of the programs.
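The pipeline the abstract describes (embed a program's IR as a graph with a GNN, then score candidate NUMA/prefetcher configurations) can be sketched roughly as follows. This is a minimal illustrative toy, not the authors' model: the node features, weights, aggregation scheme, and configuration labels are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "IR graph": 4 nodes (e.g. instructions), edges as an adjacency matrix.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)
feats = rng.normal(size=(4, 8))           # 8-dim node features (made up)

def gnn_layer(adj, h, w):
    """One message-passing step: mean-aggregate neighbors, linear map, ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    agg = (adj @ h) / np.maximum(deg, 1)  # mean over each node's neighbors
    return np.maximum((h + agg) @ w, 0)   # residual connection + ReLU

w1 = rng.normal(size=(8, 8))              # random layer weights (untrained)
w_out = rng.normal(size=(8, 3))           # linear head over 3 candidate configs

h = gnn_layer(adj, feats, w1)
graph_embedding = h.mean(axis=0)          # pool nodes -> whole-program vector
scores = graph_embedding @ w_out          # one score per candidate config

# Hypothetical configuration labels, purely for illustration.
configs = ["local-alloc/prefetch-on",
           "interleave/prefetch-on",
           "interleave/prefetch-off"]
best = configs[int(np.argmax(scores))]
print(best)
```

In a real system the weights would be trained on (IR graph, best-measured-configuration) pairs, and the graph would come from the compiler's IR rather than a hand-written adjacency matrix.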
- Published
- 2022