Author: "Guillaume Eynard-Bontemps" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Guillaume Eynard-Bontemps"' showing total 4 results

1. The Pangeo Ecosystem: Interactive Computing Tools for the Geosciences: Benchmarking on HPC.

Author: Tina Erica Odaka, Anderson Banihirwe, Guillaume Eynard-Bontemps, Aurélien Ponte, Guillaume Maze, Kevin Paul, Jared Baker, and Ryan Abernathey
Published: 2019

2. Pangeo@EOSC: deployment of PANGEO ecosystem on the European Open Science Cloud

Author: Guillaume Eynard-Bontemps, Jean Iaquinta, Sebastian Luna-Valero, Miguel Caballer, Frederic Paul, Anne Fouilloux, Benjamin Ragan-Kelley, Pier Lorenzo Marasco, and Tina Odaka
Abstract: Research projects rely heavily on the exchange and processing of data, and in this context Pangeo (https://pangeo.io/), a worldwide community of scientists and developers, strives to facilitate the deployment of ready-to-use, community-driven platforms for big data geoscience. The European Open Science Cloud (EOSC) is the main initiative in Europe for providing a federated and open multi-disciplinary environment where European researchers, innovators, companies and citizens can share, publish, find and re-use data, tools and services for research, innovation and educational purposes. While a number of services based on Jupyter Notebooks were already available, no public Pangeo deployments providing fast access to large amounts of data and compute resources were accessible on EOSC. Most existing cloud-based Pangeo deployments are USA-based, and members of the Pangeo community in Europe did not have a shared platform where scientists or technologists could exchange know-how. Pangeo teamed up with two EOSC projects, namely EGI-ACE (https://www.egi.eu/project/egi-ace/) and C-SCALE (https://c-scale.eu/), to demonstrate how to deploy and use Pangeo on EOSC and emphasise the benefits for the European community. The Pangeo Europe Community together with EGI deployed a DaskHub, composed of Dask Gateway (https://gateway.dask.org/) and JupyterHub (https://jupyter.org/hub), with a Kubernetes cluster backend on EOSC using the infrastructure of the EGI Federation (https://www.egi.eu/egi-federation/). The Pangeo EOSC JupyterHub deployment makes use of 1) the EGI Check-In to enable user registration and thereby authenticated and authorised access to the Pangeo JupyterHub portal and to the underlying distributed compute infrastructure; and 2) the EGI Cloud Compute and the cloud-based EGI Online Storage to distribute the computational tasks to a scalable compute platform and to store intermediate results produced by the user jobs.
To facilitate future Pangeo deployments on top of a wide range of cloud providers (AWS, Google Cloud, Microsoft Azure, EGI Cloud Computing, OpenNebula, OpenStack, and more), the Pangeo EOSC JupyterHub deployment is now possible through the Infrastructure Manager (IM) Dashboard (https://im.egi.eu/im-dashboard/login). All the computing and storage resources are currently supplied by CESNET (https://www.cesnet.cz/?lang=en) within the framework of the EGI-ACE project (https://im.egi.eu/). Several deployments have been made to serve the geoscience community, both for teaching and for research work. To date, more than 100 researchers have been trained on Pangeo@EOSC deployments and more are expected to join, in particular with easy access to large amounts of Copernicus data through a recent collaboration established with the C-SCALE project. In this presentation, we will provide details on the different deployments, how to get access to JupyterHub deployments and more generally how to contribute to Pangeo@EOSC.
Published: 2023

3. Pangeo framework for training: experience with FOSS4G, the CLIVAR bootcamp and the eScience course

Author: Anne Fouilloux, Pier Lorenzo Marasco, Tina Odaka, Ruth Mottram, Paul Zieger, Michael Schulz, Alejandro Coca-Castro, Jean Iaquinta, and Guillaume Eynard-Bontemps
Abstract: The ever-increasing number of scientific datasets made available by authoritative data providers (NASA, Copernicus, etc.) and provided by the scientific community opens new possibilities for advancing the state of the art in many areas of the natural sciences. As a result, researchers, innovators, companies and citizens need to acquire computational and data analysis skills to optimally exploit these datasets. Several educational programs dispense basic courses to students, and initiatives such as “The Carpentries” (https://carpentries.org/) complement this offering but also reach out to established researchers to fill the skill gap, thereby empowering them to perform their own data analysis. However, most researchers find it challenging to go beyond these training sessions and face difficulties when trying to apply their newly acquired knowledge to their own research projects. In this regard, hackathons have proven to be an efficient way to support researchers in becoming competent practitioners, but organising good hackathons is difficult and time-consuming. In addition, the need for large amounts of computational and storage resources during the training and hackathons requires a flexible solution. Here, we propose an approach where researchers work on realistic, large and complex data analysis problems similar to or directly part of their research work. Researchers access an infrastructure deployed on the European Open Science Cloud (EOSC) that supports intensive data analysis (large compute and storage resources). EOSC is a European Commission initiative for providing a federated and open multi-disciplinary environment where data, tools and services can be shared, published, found and re-used. We used Jupyter Book for delivering a collection of FAIR training materials for data analysis relying on Pangeo EOSC deployments as its primary computing platform.
The training material (https://pangeo-data.github.io/foss4g-2022/intro.html, https://pangeo-data.github.io/clivar-2022/intro.html, https://pangeo-data.github.io/escience-2022/intro.html) is customised (different datasets with similar analysis) for different target communities, and participants are taught the usage of Xarray, Dask and more generally how to efficiently access and analyse large online datasets. The training can be complemented by group work where attendees can work on larger scale scientific datasets: the classroom is split into several groups. Each group works on different scientific questions and may use different datasets. Using the Pangeo (http://pangeo.io) ecosystem is not always new for all attendees, but applying Xarray (http://xarray.pydata.org) and Dask (https://www.dask.org/) on actual scientific “mini-projects” is often a showstopper for many researchers. With this approach, attendees have the opportunity to ask questions, collaborate with other researchers as well as Research Software Engineers, and apply Open Science practices without the burden of trying and failing alone. We find the involvement of scientific computing research engineers directly in the training is crucial for the success of the hackathon approach. Feedback from attendees shows that it provides a solid foundation for big data geoscience and helps attendees to quickly become competent practitioners. It also gives infrastructure providers and EOSC useful feedback on the current and future needs of researchers for making their research FAIR and open. In this presentation, we will provide examples of achievements from attendees and present the feedback EOSC providers have received.
Published: 2023

4. The Pangeo Ecosystem: Interactive Computing Tools for the Geosciences: Benchmarking on HPC

Author: Tina Odaka, Jared Baker, Guillaume Eynard-Bontemps, Guillaume Maze, Ryan Abernathey, Anderson Banihirwe, Aurélien Ponte, and Kevin Paul
Subjects: Interactive computing, Scheme (programming language), Computer science, Xarray, Distributed computing, Cloud computing, Engineering and technology, Benchmarking, Dask, Software, Information systems, Scalability, HPC, Electrical engineering, electronic engineering, information engineering, Pangeo, Cloud, Artificial intelligence & image processing, Chunking (computing)
Abstract: The Pangeo ecosystem is an interactive computing software stack for HPC and public cloud infrastructures. In this paper, we show benchmarking results of the Pangeo platform on two different HPC systems. Four different geoscience operations were considered in this benchmarking study with varying chunk sizes and chunking schemes. Both strong and weak scaling analyses were performed. Chunk sizes from 64MB to 512MB were considered, with the best scalability obtained for 512MB. Compared to certain manual chunking schemes, the auto chunking scheme scaled well.
Published: 2019
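
Note on the abstract above: the terms it uses (chunk size, strong scaling, weak scaling) can be made concrete with a small sketch. The Python below is illustrative only and is not the paper's benchmark code. It uses only the standard library rather than Dask or Xarray. The array shape, the float64 element size, the binary definition of a megabyte, and all timings are assumptions chosen for the example.

```python
from math import ceil, prod

MB = 1024 ** 2  # assumption: 1 MB = 2**20 bytes


def chunk_nbytes(chunk_shape, itemsize=8):
    """Bytes held by one chunk of the given shape (float64 by default)."""
    return prod(chunk_shape) * itemsize


def n_chunks(array_shape, chunk_shape):
    """Number of chunks needed to tile the array along every dimension."""
    return prod(ceil(n / c) for n, c in zip(array_shape, chunk_shape))


def strong_scaling_efficiency(t_base, t_n, workers_base, workers_n):
    """Fixed total problem size: ideally runtime falls in proportion to workers."""
    return (t_base * workers_base) / (t_n * workers_n)


def weak_scaling_efficiency(t_base, t_n):
    """Problem size grows with the worker count: ideally runtime stays constant."""
    return t_base / t_n


# Hypothetical (time, lat, lon) float64 array; not a dataset from the paper.
array_shape = (1000, 4096, 4096)

# Hypothetical manual chunk shapes that land on 64, 128, 256 and 512 MB.
for chunk_shape in [(1, 2048, 4096), (1, 4096, 4096),
                    (2, 4096, 4096), (4, 4096, 4096)]:
    size_mb = chunk_nbytes(chunk_shape) // MB
    print(chunk_shape, f"{size_mb} MB per chunk,",
          n_chunks(array_shape, chunk_shape), "chunks")

# Hypothetical timings in seconds, purely to show the two formulas.
print(strong_scaling_efficiency(t_base=100.0, t_n=30.0,
                                workers_base=1, workers_n=4))
print(weak_scaling_efficiency(t_base=100.0, t_n=125.0))
```

Larger chunks mean fewer tasks for the scheduler to track, which is one plausible reading of why the abstract reports the best scalability at 512MB. The sketch only counts chunks and does not measure that effect.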