Back to Search
Start Over
Exploring the carbon footprint of Hugging Face's ML models: a repository mining study
- Publication Year :
- 2023
-
Abstract
- Background: The rise of machine learning (ML) systems has exacerbated their carbon footprint due to increased capabilities and model sizes. However, there is scarce knowledge on how the carbon footprint of ML models is actually measured, reported, and evaluated. Aims: This paper analyzes the measurement of the carbon footprint of 1,417 ML models and associated datasets on Hugging Face. Hugging Face is the most popular repository for pretrained ML models. We aim to provide insights and recommendations on how to report and optimize the carbon efficiency of ML models. Method: We conduct the first repository mining study on the Hugging Face Hub API on carbon emissions and answer two research questions: (1) how do ML model creators measure and report carbon emissions on Hugging Face Hub?, and (2) what aspects impact the carbon emissions of training ML models? Results: Key findings from the study include a stalled proportion of carbon emissions-reporting models, a slight decrease in reported carbon footprint on Hugging Face over the past 2 years, and a continued dominance of NLP as the main application domain reporting emissions. The study also uncovers correlations between carbon emissions and various attributes, such as model size, dataset size, ML application domains and performance metrics. Conclusions: The results emphasize the need for software measurements to improve energy reporting practices and the promotion of carbon-efficient model development within the Hugging Face community. To address this issue, we propose two classifications: one for categorizing models based on their carbon emission reporting practices and another for their carbon efficiency. With these classification proposals, we aim to encourage transparency and sustainable model development within the ML community.<br />This work is partially supported by the project TED2021- 130923B-I00, funded by MCIN/AEI/10.13039/501100011033 and the European Union Next Generation EU/PRTR.<br />Peer Reviewed<br />Postprint (author's final draft)
Details
- Database :
- OAIster
- Notes :
- application/pdf, English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1417307916
- Document Type :
- Electronic Resource