Reproducible Cross-border High Performance Computing for Scientific Portals

Authors :
Kessy Abarenkov
Anne Fouilloux
Helmut Neukirchen
Abdulrahman Azab
Source :
2nd Workshop on Reproducible Workflows, Data Management, and Security
Publication Year :
2022
Publisher :
arXiv, 2022.

Abstract

Reproducing eScience results requires solving several challenges: scientific workflows need to be automated; the software versions involved need to be provided in an unambiguous way; and input data needs to be easily accessible. High-Performance Computing (HPC) clusters are often involved, and to achieve bit-to-bit reproducibility it might even be necessary to execute the code on a particular cluster to avoid differences caused by different HPC platforms; unless this is a scientist's local cluster, it needs to be accessed across (administrative) borders. Preferably, everything should be user-friendly so that even inexperienced users can (re-)produce results. While some easy-to-use web-based scientific portals already support access to HPC resources, this support is typically limited to local computing and data resources. Using two community-specific portals in the fields of biodiversity and climate research as examples, we present a solution for accessing remote HPC (and cloud) compute and data resources from scientific portals across borders, involving rigorous container-based packaging of the software version and setup automation, thus enhancing reproducibility.

Comment: Accepted at the 2nd Workshop on Reproducible Workflows, Data Management, and Security, held during eScience in Salt Lake City, Utah, USA, 11-14 October 2022.

Details

Database :
OpenAIRE
Journal :
2nd Workshop on Reproducible Workflows, Data Management, and Security
Accession number :
edsair.doi.dedup.....054441faf7bfa976f8fe38b278291d9c
Full Text :
https://doi.org/10.48550/arxiv.2209.00596