Pangeo@EOSC: deployment of PANGEO ecosystem on the European Open Science Cloud
- Publication Year :
- 2023
- Publisher :
- Copernicus GmbH, 2023.
Abstract
- Research projects rely heavily on the exchange and processing of data. In this context, Pangeo (https://pangeo.io/), a worldwide community of scientists and developers, strives to facilitate the deployment of ready-to-use, community-driven platforms for big data geoscience. The European Open Science Cloud (EOSC) is the main initiative in Europe for providing a federated, open, multi-disciplinary environment where European researchers, innovators, companies and citizens can share, publish, find and re-use data, tools and services for research, innovation and educational purposes. While a number of services based on Jupyter Notebooks were already available, no public Pangeo deployments providing fast access to large amounts of data and compute resources were accessible on EOSC. Most existing cloud-based Pangeo deployments are USA-based, and members of the Pangeo community in Europe did not have a shared platform where scientists or technologists could exchange know-how. Pangeo teamed up with two EOSC projects, namely EGI-ACE (https://www.egi.eu/project/egi-ace/) and C-SCALE (https://c-scale.eu/), to demonstrate how to deploy and use Pangeo on EOSC and to emphasise the benefits for the European community. The Pangeo Europe Community, together with EGI, deployed a DaskHub, composed of Dask Gateway (https://gateway.dask.org/) and JupyterHub (https://jupyter.org/hub), with a Kubernetes cluster backend on EOSC using the infrastructure of the EGI Federation (https://www.egi.eu/egi-federation/). The Pangeo EOSC JupyterHub deployment makes use of 1) EGI Check-In to enable user registration and thereby authenticated and authorised access to the Pangeo JupyterHub portal and to the underlying distributed compute infrastructure; and 2) EGI Cloud Compute and the cloud-based EGI Online Storage to distribute the computational tasks to a scalable compute platform and to store intermediate results produced by the user jobs.
To facilitate future Pangeo deployments on top of a wide range of cloud providers (AWS, Google Cloud, Microsoft Azure, EGI Cloud Computing, OpenNebula, OpenStack, and more), the Pangeo EOSC JupyterHub can now be deployed through the Infrastructure Manager (IM) Dashboard (https://im.egi.eu/im-dashboard/login). All the computing and storage resources are currently supplied by CESNET (https://www.cesnet.cz/?lang=en) within the framework of the EGI-ACE project (https://im.egi.eu/). Several deployments have been made to serve the geoscience community, both for teaching and for research work. To date, more than 100 researchers have been trained on Pangeo@EOSC deployments, and more are expected to join, in particular thanks to easy access to large amounts of Copernicus data through a recent collaboration established with the C-SCALE project. In this presentation, we will provide details on the different deployments, on how to get access to the JupyterHub deployments and, more generally, on how to contribute to Pangeo@EOSC.
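As an illustration of how a user might request compute from a DaskHub of the kind described in the abstract, the following is a minimal sketch and not the configuration of the Pangeo@EOSC deployment itself. It assumes the `dask-gateway` client library is installed in the notebook environment and that the hub injects the gateway address and credentials, as DaskHub setups commonly do. The helper names `worker_bounds` and `start_cluster`, and the cap of 8 workers, are hypothetical and chosen only for this example.

```python
def worker_bounds(n_chunks, max_workers=8):
    """Return (minimum, maximum) workers for adaptive scaling.

    Hypothetical policy for this sketch: always allow at least one
    worker, and never more than one per data chunk or than max_workers.
    """
    upper = max(1, min(n_chunks, max_workers))
    return 1, upper


def start_cluster(n_chunks, max_workers=8):
    """Request a Dask cluster through Dask Gateway and return (cluster, client)."""
    # Imported lazily so worker_bounds can be used without dask-gateway installed.
    from dask_gateway import Gateway

    # Assumption: address and authentication are picked up from the
    # environment configured by the hub, so no arguments are passed.
    gateway = Gateway()
    cluster = gateway.new_cluster()

    # Let the gateway add or remove workers within the computed bounds.
    minimum, maximum = worker_bounds(n_chunks, max_workers)
    cluster.adapt(minimum=minimum, maximum=maximum)

    # The client connects the notebook session to the new cluster.
    client = cluster.get_client()
    return cluster, client
```

In a notebook session on such a hub, one would call `cluster, client = start_cluster(n_chunks=24)`, run the Dask computation, and then release the resources with `cluster.shutdown()`. Actual worker limits depend on how the operators have configured the gateway.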
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.doi...........10595ff616563ce8cd406aace1aa57c4
- Full Text :
- https://doi.org/10.5194/egusphere-egu23-9095