1. Species profiles support recommendations for quality filtering of opportunistic citizen science data
- Author
- Camille Van Eupen, Dirk Maes, Marc Herremans, Kristijn R.R. Swinnen, Ben Somers, and Stijn Luca
- Subjects
Animal Ecology and Physiology, Presence-only data, Ecological Modeling, ACCURACY, CONSERVATION, UNCERTAINTY, PERFORMANCE, DISTRIBUTION MODELS, Mathematics and Statistics, BIAS, Earth and Environmental Sciences, SAMPLE-SIZE, BUTTERFLIES, Opportunistic data, PREDICTION ERRORS, Data quality filtering, Species distribution models, Species traits, Filtering recommendations, TRAITS
- Abstract
Opportunistic citizen science data are commonly filtered in an attempt to improve their applicability for relating species occurrences to environmental variables. Recommendations on when and how to filter, however, have remained relatively general, and associations between species traits and filtering recommendations are sparse. We collected six traits (body size, detectability, classification error rate, familiarity, reporting probability and range size) of 52 birds, 25 butterflies and 14 dragonflies. Both absolute traits (values not rescaled) and relative traits (values rescaled per taxonomic group) were linked to filter effects, i.e. the impact on three different measures of species distribution model performance caused by applying three different quality filters, for different degrees of sample size reduction. First, we applied multiple regressions that predicted the filter effects from either absolute traits (including taxonomic group) or relative traits. Second, a principal component analysis and a clustering analysis were performed to define five species profiles based on the species traits that were retained after a multiple regression model selection. The analysis of the profiles indicated the relative importance of species traits and revealed new insights into the association of species traits with changes in model performance after data quality filtering. Both taxonomic group (more than absolute traits) and relative species traits (mainly classification error rate, range size and familiarity) defined the impact of data quality filtering on model performance. We therefore discourage selecting a quality filtering strategy based on a single species trait. Results further confirmed the importance of considering the goal of the study (i.e. increasing model discrimination capacity, sensitivity or specificity) as well as the change in sample size caused by stringent filtering.
The general species knowledge amongst citizen scientists (importance of observer experience), together with the mechanism of record verification in an opportunistic data platform (importance of verifiable metadata), has the largest potential for enhancing the quality of opportunistic records.
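The distinction the abstract draws between absolute and relative traits can be illustrated with a minimal sketch. The abstract states only that relative traits are "rescaled per taxonomic group"; the min-max rescaling used here is an assumption, as are the function name `rescale_per_group`, the record layout, and the trait values, which are made up and not taken from the study.

```python
from collections import defaultdict


def rescale_per_group(records, trait):
    """Min-max rescale one trait to [0, 1] within each taxonomic group.

    Assumption: min-max scaling stands in for whatever rescaling the
    authors actually applied; the abstract does not specify it.
    """
    # Collect the absolute trait values of each taxonomic group.
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec["group"]].append(rec[trait])

    # Minimum and maximum of the trait per group.
    bounds = {g: (min(v), max(v)) for g, v in by_group.items()}

    relative = {}
    for rec in records:
        lo, hi = bounds[rec["group"]]
        # A group without spread in this trait gets 0.0 for every species.
        relative[rec["species"]] = 0.0 if hi == lo else (rec[trait] - lo) / (hi - lo)
    return relative


# Hypothetical records: values are illustrative only.
records = [
    {"species": "bird_a", "group": "birds", "body_size": 10.0},
    {"species": "bird_b", "group": "birds", "body_size": 30.0},
    {"species": "bird_c", "group": "birds", "body_size": 50.0},
    {"species": "butterfly_a", "group": "butterflies", "body_size": 2.0},
    {"species": "butterfly_b", "group": "butterflies", "body_size": 4.0},
]

relative_body_size = rescale_per_group(records, "body_size")
```

Under this assumed rescaling, the largest butterfly and the largest bird both receive a relative body size of 1.0 even though their absolute sizes differ greatly, which is why a relative trait carries information separate from taxonomic group.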
- Published
- 2022