Using the Gini coefficient to characterize the shape of computational chemistry error distributions

Authors :: Pascal Pernot
Andreas Savin
Institut de Chimie Physique (ICP)
Institut de Chimie du CNRS (INC)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
Laboratoire de chimie théorique (LCT)
Institut de Chimie du CNRS (INC)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
Source :: Theoretical Chemistry Accounts: Theory, Computation, and Modeling, Springer Verlag, 2021, 140 (3), pp.24. ⟨10.1007/s00214-021-02725-0⟩
Publication Year :: 2021
Publisher :: HAL CCSD, 2021.
Abstract: The distribution of errors is a central object in the assessment and benchmarking of computational chemistry methods. The popular and often blind use of the mean unsigned error as a benchmarking statistic leads one to ignore features of the distribution that affect the reliability of the tested methods. We explore how the Gini coefficient offers a global representation of the error distribution but, except for extreme values, does not enable an unambiguous diagnostic. We propose to remove the ambiguity by applying the Gini coefficient to mode-centered error distributions. This version can usefully complement benchmarking statistics and flag error sets with potentially problematic shapes.
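
As a rough illustration of the statistic named in the abstract, the following is a minimal sketch of a Gini coefficient computed on absolute errors, with an optional centering step. It is not the authors' implementation. The function names `gini` and `centered_gini` are hypothetical, and the `center` argument is an assumption: the abstract does not say how the mode is estimated, so the caller must supply that value.

```python
import numpy as np


def gini(values):
    """Gini coefficient of the absolute values of `values`.

    Uses the standard rank-based formula on the sorted sample:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n.
    """
    x = np.sort(np.abs(np.asarray(values, dtype=float)))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0.0:
        # Degenerate sample (empty or all zeros): report no inequality.
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)


def centered_gini(errors, center):
    """Gini coefficient of errors after subtracting `center`.

    Assumption: `center` is an estimate of the mode of the error
    distribution, obtained by whatever estimator the user prefers.
    """
    return gini(np.asarray(errors, dtype=float) - center)
```

With this definition, a set of identical absolute errors gives 0, and a set of n errors in which a single one is nonzero gives (n - 1) / n, which is the largest value the finite-sample formula can reach.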

Language :: English
ISSN :: 1432-881X and 1432-2234
Database :: OpenAIRE
Journal :: Theoretical Chemistry Accounts: Theory, Computation, and Modeling
Accession number :: edsair.doi.dedup.....6e9eb8da2c1104b0c0388ef02762e8d5
Full Text :: https://doi.org/10.1007/s00214-021-02725-0
