Portable, Scalable, and Reproducible Scientific Computing: from Cloud to HPC

Authors :
Stubbs, Joe
Jamthe, Anagha
Black, Steve
Cleveland, Sean
Looney, Julia
Publication Year :
2021
Publisher :
Zenodo, 2021.

Abstract

This tutorial will expose attendees to state-of-the-art techniques for portable, reproducible research computing, enabling them to transport analyses from cloud to HPC resources, share computations with collaborators, and disseminate final results to communities of interest. We will introduce various open source technologies, including Jupyter, Docker, and Singularity, and show how to use these tools within the NSF-funded Tapis v3 platform, an Application Programming Interface (API) for distributed computation. After a brief introduction to these technologies, the tutorial will focus on hands-on exercises in which attendees build a portable analysis that can be moved between different execution environments, including a small virtual machine and a national-scale supercomputer. Using techniques covered in the tutorial, attendees will also be able to share their results with one or more additional users. The tutorial will use a specific machine learning image classifier analysis to illustrate the concepts, but the techniques introduced can be applied to a broad class of analyses in virtually any domain of science or engineering.
Description and Format: TACC training accounts will be set up for all registered attendees, with access to allocations on XSEDE cloud systems and one or more HPC resources such as TACC’s Stampede2 or Frontera. The tutorial will consist of hands-on exercises in which attendees interact with the Tapis v3 services from within a Jupyter notebook. Registered attendees will be notified of their account details closer to the tutorial date. All course materials will be published on GitHub Pages so that attendees can access them during and after the tutorial. Proctors will be available throughout the session to help attendees through Slack or breakout sessions. The proposed tutorial schedule is shown in Table 1.
Learning Outcomes: In this tutorial, attendees will gain an understanding of using container technology (Docker, Singularity) for portable analysis, programmatically executing analyses in both cloud and HPC environments using an API, interacting with and visualizing the results in Jupyter notebooks, and sharing results with collaborators. By the end of this workshop, attendees will be able to:
• Demonstrate a basic understanding of Docker and Singularity containers in relation to computational research.
• Use Tapis to access HPC storage and compute resources in a programmatic and reproducible way.
• Utilize Jupyter notebooks for interactive computing.
• Use Tapis to share results with others.
Content Level and Length: Beginner 70%, Intermediate 30%; 3 hours.
Audience Prerequisites: Basic familiarity with Jupyter notebooks and Python will be helpful. Attendees must use their own laptop for the hands-on part of the tutorial.

Details

Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....16e25d327f2b1e82892af76cfdc52709
Full Text :
https://doi.org/10.5281/zenodo.5570274