How much should one sample to accurately predict the distribution of species assemblages? A virtual community approach.

Authors :
Fernandes, Rui F.
Scherrer, Daniel
Guisan, Antoine
Source :
Ecological Informatics; Nov 2018, Vol. 48, pp. 125-134 (10 pages)
Publication Year :
2018

Abstract

Correlative species distribution models (SDMs) are widely used to predict species distributions and assemblages, with many fundamental and applied uses. Different factors were shown to affect SDM prediction accuracy. However, real data cannot give unambiguous answers on these issues, and for this reason, artificial data have been increasingly used in recent years. Here, we move one step further by assessing how different factors can affect the prediction accuracy of virtual assemblages obtained by stacking individual SDM predictions (stacked SDMs, S-SDM). We modelled 100 virtual species in a real study area, testing five different factors: sample size (200, 800, 3200), sampling method (nested, non-nested), sampling prevalence (25%, 50%, 75% and species true prevalence), modelling technique (GAM, GLM, BRT and RF) and thresholding method (ROC, MaxTSS, and MaxKappa). We showed that the accuracy of S-SDM predictions is mostly affected by modelling technique followed by sample size. Models fitted by GAM/GLM had a higher accuracy and lower variance than BRT/RF. Model accuracy increased with sample size and a sampling strategy reflecting the true prevalence of the species was most successful. However, even with sample sizes as high as >3000 sites, residual uncertainty remained in the predictions, potentially reflecting a bias introduced by creating and/or resampling the virtual species. Therefore, when evaluating the accuracy of predictions from S-SDMs fitted with real field data, one can hardly expect reaching perfect accuracy, and reasonably high values of similarity or predictive success can already be seen as valuable predictions. We recommend the use of a 'plot-like' sampling method (best approximation of the species' true prevalence) and not simply increasing the number of presences-absences of species. As presented here, virtual simulations might be used more systematically in future studies to inform about the best accuracy level that one could expect given the characteristics of the data and the methods used to fit and stack SDMs.

Highlights

• Virtual species were used to test factors affecting spatial assemblage predictions.
• Modelling technique and sample size proved to be the most important factors.
• Sampling procedures caused uncertainty in predictions even under a "known" truth.
• A 'plot-like' sampling method ensures accurate predictions even at small samples.
• Confirmed the importance of evaluating relative effects of factors affecting SDMs.

[ABSTRACT FROM AUTHOR]
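
For readers unfamiliar with stacked SDMs, the following is a minimal illustrative sketch of the general idea the abstract describes: per-species probabilities are binarised with a MaxTSS cutoff, stacked into site-level assemblages, and compared with the known truth. It is not the authors' code. The function names, the toy data, and the choice of the Sørensen index as the similarity measure are all assumptions made for illustration; the abstract itself only speaks of "similarity or predictive success".

```python
import numpy as np

def max_tss_threshold(y_true, prob):
    """Return (cutoff, TSS) maximising TSS = sensitivity + specificity - 1."""
    y_true = np.asarray(y_true, dtype=bool)
    prob = np.asarray(prob, dtype=float)
    best_t, best_tss = 0.5, -np.inf
    for t in np.unique(prob):              # candidate cutoffs, ascending
        pred = prob >= t
        tp = np.sum(pred & y_true)
        tn = np.sum(~pred & ~y_true)
        sens = tp / max(y_true.sum(), 1)   # guard against no presences
        spec = tn / max((~y_true).sum(), 1)  # guard against no absences
        tss = sens + spec - 1
        if tss > best_tss:                 # ties keep the lowest cutoff
            best_t, best_tss = t, tss
    return float(best_t), float(best_tss)

def stack_sdms(prob_matrix, thresholds):
    """Binarise a sites x species probability matrix with per-species cutoffs."""
    return np.asarray(prob_matrix) >= np.asarray(thresholds)[None, :]

def sorensen(pred_row, true_row):
    """Sørensen similarity between predicted and true assemblage at one site."""
    pred_row = np.asarray(pred_row, dtype=bool)
    true_row = np.asarray(true_row, dtype=bool)
    a = np.sum(pred_row & true_row)    # species correctly predicted present
    b = np.sum(pred_row & ~true_row)   # species wrongly predicted present
    c = np.sum(~pred_row & true_row)   # species wrongly predicted absent
    denom = 2 * a + b + c
    return 1.0 if denom == 0 else float(2 * a / denom)

# Hypothetical toy data: 4 sites x 2 virtual species (not from the paper).
true_pa = np.array([[1, 0], [1, 1], [0, 1], [0, 0]], dtype=bool)
prob = np.array([[0.9, 0.2], [0.7, 0.8], [0.3, 0.6], [0.1, 0.4]])

thresholds = [max_tss_threshold(true_pa[:, j], prob[:, j])[0]
              for j in range(prob.shape[1])]
stacked = stack_sdms(prob, thresholds)
site_similarity = [sorensen(stacked[i], true_pa[i]) for i in range(len(prob))]
predicted_richness = stacked.sum(axis=1)
```

In a virtual-species setting the true presence-absence matrix is known for every site, which is what makes a per-site comparison of this kind possible; with real field data the same comparison can only be made against independent evaluation plots.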

Details

Language :
English
ISSN :
1574-9541
Volume :
48
Database :
Supplemental Index
Journal :
Ecological Informatics
Publication Type :
Academic Journal
Accession number :
133191446
Full Text :
https://doi.org/10.1016/j.ecoinf.2018.09.002