Janne Soininen, Miska Luoto, F. Guillaume Blanchet, Dominique Gravel, Jane Elith, Ian Renner, Miguel B. Araújo, Jarno Vanhatalo, Niklaus E. Zimmermann, Nicole A. Hill, Aleksi Lehikoinen, Barbara J. Anderson, Anna Norberg, Antoine Guisan, David I. Warton, Jani Anttila, Graeme Newell, William Godsoe, David B. Dunson, John Atle Kålås, Frederick R. Adler, Francis K. C. Hui, Nerea Abrego, Bob O'Hara, Janet Franklin, Heidi K. Mod, Robert D. Holt, Tad A. Dallas, Matt White, Richard Fox, Scott D. Foster, Magne Husby, Otso Ovaskainen, Wilfried Thuiller, Tomas Roslin, Research Foundation of the University of Helsinki, Academy of Finland, Research Council of Norway, Jane and Aatos Erkko Foundation, Ministerio de Ciencia, Innovación y Universidades (España), Organismal and Evolutionary Biology Research Programme, Spatial Foodweb Ecology Group, Department of Agricultural Sciences, Research Centre for Ecological Change, Helsinki Institute of Sustainability Science (HELSUS), Finnish Museum of Natural History, Department of Geosciences and Geography, BioGeoClimate Modelling Lab, Environmental and Ecological Statistics Group, Biostatistics Helsinki, Otso Ovaskainen / Principal Investigator, Laboratoire d'Ecologie Alpine (LECA), and Université Savoie Mont Blanc (USMB [Université de Savoie] [Université de Chambéry])-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
A large array of species distribution model (SDM) approaches has been developed for explaining and predicting the occurrences of individual species or species assemblages. Given the wealth of existing models, it is unclear which models perform best for interpolation or extrapolation of existing data sets, particularly when one is concerned with species assemblages. We compared the predictive performance of 33 variants of 15 widely applied and recently emerged SDMs in the context of multispecies data, including both joint SDMs, which model multiple species together, and stacked SDMs, which model each species individually and combine the predictions afterward. We offer a comprehensive evaluation of these SDM approaches by examining their performance in predicting withheld empirical validation data of different sizes representing five different taxonomic groups, and for prediction tasks related to both interpolation and extrapolation. We measure predictive performance by 12 measures of accuracy, discrimination power, calibration, and precision of predictions, for the biological levels of species occurrence, species richness, and community composition. Our results show large variation among the models in their predictive performance, especially for communities comprising many species that are rare. The results do not reveal any major trade-offs among measures of model performance; the same models performed generally well in terms of accuracy, discrimination, and calibration, and for the biological levels of individual species, species richness, and community composition. In contrast, the models that gave the most precise predictions were not well calibrated, suggesting that poorly performing models can make overconfident predictions. However, none of the models performed well for all prediction tasks.
As a general strategy, we therefore propose that researchers fit a small set of models showing complementary performance, and then apply a cross-validation procedure involving separate data to establish which of these models performs best for the goal of the study.
This work was funded by the Research Foundation of the University of Helsinki (A. Norberg), the Academy of Finland (CoE grant 284601 and grant 309581 to O. Ovaskainen, grant 308651 to N. Abrego, grant 1275606 to A. Lehikoinen), the Research Council of Norway (CoE grant 223257), the Jane and Aatos Erkko Foundation, and the Ministry of Science, Innovation and Universities (grant CGL2015‐68438‐P to M. B. Araújo).
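The strategy proposed in the abstract, comparing a small set of candidate models on withheld validation data, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the model names, the toy presence/absence data, and the predicted probabilities are hypothetical, and the two scores shown (mean per-species AUC for discrimination, RMSE of stacked species richness for accuracy) are only simple stand-ins for the 12 measures used in the study. In practice the probabilities would come from fitted SDMs rather than hard-coded values.

```python
# Hypothetical illustration only: names and numbers are not from the study.

def auc(y_true, p_pred):
    """Discrimination: chance that a random presence is ranked above a
    random absence (ties count one half). Returns NaN if undefined."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def mean_species_auc(y_matrix, prob_matrix):
    """Average AUC over species (columns), skipping undefined cases."""
    scores = []
    for j in range(len(y_matrix[0])):
        s = auc([row[j] for row in y_matrix],
                [row[j] for row in prob_matrix])
        if s == s:  # NaN is the only value not equal to itself
            scores.append(s)
    return sum(scores) / len(scores)

def stacked_richness(prob_matrix):
    """Stacked richness: sum per-species occurrence probabilities per site."""
    return [sum(row) for row in prob_matrix]

def richness_rmse(y_matrix, prob_matrix):
    """Accuracy at the richness level: RMSE of predicted vs observed."""
    observed = [sum(row) for row in y_matrix]
    predicted = stacked_richness(prob_matrix)
    sq = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return (sum(sq) / len(sq)) ** 0.5

# Withheld validation data: 4 sites (rows) x 2 species (columns).
y_valid = [[1, 0], [0, 1], [1, 1], [0, 0]]

# Predicted occurrence probabilities from two hypothetical candidate models.
candidates = {
    "model_a": [[0.9, 0.2], [0.1, 0.8], [0.8, 0.7], [0.2, 0.1]],
    "model_b": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
}

# Score each candidate as (mean AUC, richness RMSE).
scores = {name: (mean_species_auc(y_valid, p), richness_rmse(y_valid, p))
          for name, p in candidates.items()}

# Pick the candidate with the best discrimination on the withheld data.
best = max(scores, key=lambda name: scores[name][0])
print(best, scores[best])
```

Which score drives the choice depends on the goal of the study; ranking by richness RMSE instead of AUC would only require changing the key used in the final selection.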