Pipelining the Fast Multipole Method over a Runtime System

Authors :
Agullo, Emmanuel
Bramas, Bérenger
Coulaud, Olivier
Darve, Eric
Messner, Matthias
Takahashi, Toru
Laboratoire Bordelais de Recherche en Informatique (LaBRI)
Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
High-End Parallel Algorithms for Challenging Numerical Simulations (HiePACS)
Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Inria Bordeaux - Sud-Ouest
Institut National de Recherche en Informatique et en Automatique (Inria)
Department of Mechanical Engineering [Stanford]
Stanford University
Institute for Computational and Mathematical Engineering [Stanford] (ICME)
Department of Mechanical Science and Engineering
Nagoya University
Equipe associée FastLA
INRIA
PlaFRIM
FastLA
Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Inria Bordeaux - Sud-Ouest
FastLA-PLAFRIM
HiePACS
Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)
Source :
[Research Report] RR-7981, INRIA. 2012, pp.24, SIAM Conference on Computational Science and Engineering (SIAM CSE 2013), Feb 2013, Boston, United States
Publication Year :
2012
Publisher :
HAL CCSD, 2012.

Abstract

The Fast Multipole Method (FMM) is a fundamental operation in the simulation of many physical problems. High-performance implementations of the method usually require careful tuning of the algorithm for both the targeted physics and the hardware. In this paper, we propose a new approach that achieves high performance across architectures. Our method consists of expressing the FMM algorithm as a task flow and employing a state-of-the-art runtime system, StarPU, to process the tasks on the different processing units. We carefully design the task flow, the mathematical operators, their Central Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, and the scheduling schemes. We compute potentials and forces of 200 million particles in 48.7 seconds on a homogeneous 160-core SGI Altix UV 100, and of 38 million particles in 13.34 seconds on a heterogeneous 12-core Intel Nehalem processor enhanced with 3 Nvidia M2090 Fermi GPUs.
No. RR-7981 (2012)
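
To illustrate what "expressing the FMM algorithm as a task flow" can mean, the sketch below orders the standard FMM operators by their data dependencies. It is a coarse, stage-level simplification written in Python with the standard-library `graphlib` module; it does not use StarPU and is not the task flow designed in the report. The names `FMM_DEPENDENCIES` and `schedule_in_waves` are hypothetical, introduced only for this example.

```python
from graphlib import TopologicalSorter

# Assumption: textbook FMM operator ordering at stage granularity.
# Each key depends on the stages listed in its value set.
FMM_DEPENDENCIES = {
    "P2M": set(),      # particles -> leaf multipole expansions
    "M2M": {"P2M"},    # upward pass: children -> parent multipoles
    "M2L": {"M2M"},    # transfer pass: multipole -> local expansions
    "L2L": {"M2L"},    # downward pass: parent -> children locals
    "L2P": {"L2L"},    # leaf local expansions -> particles
    "P2P": set(),      # direct near-field interactions, independent
}

def schedule_in_waves(dependencies):
    """Group stages into waves; stages in one wave could run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # stages whose inputs are complete
        waves.append(ready)
        sorter.done(*ready)                 # release the stages that depend on them
    return waves

waves = schedule_in_waves(FMM_DEPENDENCIES)
for index, wave in enumerate(waves):
    print(index, wave)
```

In this simplified graph the near-field P2P stage has no dependency on the far-field chain, so a scheduler is free to overlap it with the upward and downward passes. A task-based runtime works at much finer granularity (per cell or per group of cells), which this sketch does not attempt to model.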

Details

Language :
English
Database :
OpenAIRE
Journal :
[Research Report] RR-7981, INRIA. 2012, pp.24, SIAM Conference on Computational Science and Engineering (SIAM CSE 2013), Feb 2013, Boston, United States
Accession number :
edsair.doi.dedup.....b27b658129714e8b089a287b1bc2615e