Technical report 202, PRACE

Authors :
Artigues, Antoni
Houzeaux, Guillaume
Barcelona Supercomputing Center
Source :
UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC), Recercat. Dipósit de la Recerca de Catalunya
Publication Year :
2015

Abstract

The Alya System is the BSC simulation code for multi-physics problems [1]. It is based on a Variational Multiscale Finite Element Method for unstructured meshes. Work distribution is achieved by partitioning the original mesh into subdomains (submeshes). Until now, this pre-partitioning step has been done serially by a single process, using the metis library [2]. This is a major bottleneck when large meshes with millions of elements have to be partitioned: either the data does not fit in the memory of a single computing node or, when it does fit, Alya takes too long in the partitioning step. In this document we explain the work done to design, implement and test a new parallel partitioning algorithm for Alya. In this algorithm, a subset of the workers is in charge of partitioning the mesh in parallel, using the parmetis library [3]. The partitioning workers load consecutive parts of the main mesh, with the help of a parallel space-partitioning bin structure [4] that can obtain the adjacent boundary elements of their respective submeshes. With this local mesh, each partitioning worker is able to create its local element adjacency graph and to partition the mesh. We validated the new algorithm using a Navier-Stokes problem on a small cube mesh of 1000 elements. We then performed a scalability test on a 30M-element mesh to check whether the time to partition the mesh decreases in proportion to the number of partitioning workers. We also compared metis and parmetis with respect to the balance of the element distribution among the domains, to test how using many partitioning workers to partition the mesh affects the scalability of Alya. In these tests we observed that it is better to use fewer partitioning workers to partition the mesh. Finally, two sections explain the results and the future work needed to finalise and improve the parallel partitioning algorithm.
This work was financially supported by the PRACE project, funded in part by the EU's 7th Framework Programme (FP7/2007-2013) under grant agreement no. RI-312763.
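
The two local steps named in the abstract, assigning consecutive blocks of elements to the partitioning workers and building an element adjacency graph, can be illustrated with a minimal single-process sketch. It is not taken from Alya: the function names `block_ranges` and `element_adjacency` are hypothetical, the rule that two elements are adjacent when they share at least `min_shared` nodes is an assumption, and no MPI, bin structure or parmetis call is involved. The graph is returned in the compressed (`xadj`, `adjncy`) layout that graph partitioners of the metis family conventionally take as input.

```python
from collections import defaultdict


def block_ranges(n_elements, n_workers):
    """Split element ids 0..n_elements-1 into consecutive blocks, one per worker.

    Returns a list of n_workers + 1 offsets; worker r owns the half-open
    range [bounds[r], bounds[r + 1]). Block sizes differ by at most one.
    """
    base, extra = divmod(n_elements, n_workers)
    bounds = [0]
    for rank in range(n_workers):
        bounds.append(bounds[-1] + base + (1 if rank < extra else 0))
    return bounds


def element_adjacency(elements, min_shared=2):
    """Build the element adjacency (dual) graph in compressed form.

    elements: list of node-id lists, one per element (distinct ids per element).
    Two elements are treated as adjacent when they share at least
    min_shared nodes (assumption: 2 for 2D edges, 3 or more for 3D faces).
    Returns (xadj, adjncy): neighbours of element e are
    adjncy[xadj[e]:xadj[e + 1]], sorted by element id.
    """
    # Invert the connectivity: which elements touch each node.
    node_to_elems = defaultdict(list)
    for elem, nodes in enumerate(elements):
        for node in nodes:
            node_to_elems[node].append(elem)

    xadj, adjncy = [0], []
    for elem, nodes in enumerate(elements):
        # Count how many nodes each other element shares with this one.
        shared = defaultdict(int)
        for node in nodes:
            for other in node_to_elems[node]:
                if other != elem:
                    shared[other] += 1
        neighbours = sorted(o for o, count in shared.items() if count >= min_shared)
        adjncy.extend(neighbours)
        xadj.append(len(adjncy))
    return xadj, adjncy


# A 2 x 2 grid of quadrilaterals on a 3 x 3 node grid (nodes numbered row by row).
quads = [[0, 1, 4, 3], [1, 2, 5, 4], [3, 4, 7, 6], [4, 5, 8, 7]]
xadj, adjncy = element_adjacency(quads)
bounds = block_ranges(len(quads), 2)
```

In a distributed setting each worker would hold only its own block, so elements on the other side of a block boundary would have to be fetched from neighbouring workers before the local graph is complete; the abstract attributes that role to the bin structure [4], and the sketch sidesteps it by keeping the whole mesh in one process.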

Details

Language :
English
Database :
OpenAIRE
Journal :
UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC), Recercat. Dipósit de la Recerca de Catalunya
Accession number :
edsair.dedup.wf.001..5555272c469f18e47117a3c317c8ecfd