1. BEpipeR: a user-friendly, flexible, and scalable data synthesis pipeline for the Biodiversity Exploratories and other research consortia [version 1; peer review: awaiting peer review]
- Author
- Marcel Glück, Oliver Bossdorf, and Henri A. Thomassen
- Subjects
Software Tool Article, Articles, Research consortia, large-scale long-term environmental research, environmental data, data democratization and utilization, reproducibility, R programming language, Biodiversity Exploratories, BExIS
- Abstract
Background Large research consortia can generate tremendous amounts of biological information, including high-resolution soil, vegetation, and climate data. While this knowledge stock holds invaluable potential for answering evolutionary and ecological questions, making these data exploitable for modelling remains a daunting task due to the many processing steps required for synthesis. As a result, many researchers may fall back on a handful of ready-to-use data sets, potentially at the expense of statistical power and scientific rigour. In a push for a more stringent approach, we introduce BEpipeR, an R pipeline that allows for the streamlined synthesis of plot-based Biodiversity Exploratories data.
Methods BEpipeR was designed with flexibility and ease of use in mind. For instance, users simply choose between aggregating forest or grassland data, or a combination thereof, effectively allowing them to process any experimental plot data of this research consortium. Additionally, instead of coding, they pass most processing information to the pipeline in a user-friendly way through parameter sheets. Processing includes, among other steps, the creation of a spatially explicit plot-ID template, data wrangling, quality control, plot-wise aggregations, the calculation of derived metrics, the joining of data into a large composite data set, and metadata compilation.
Results With BEpipeR, we provide a feature-rich pipeline that allows users to process Biodiversity Exploratories data in a flexible and reproducible way. This pipeline might serve as a starting point for aggregating the numerous data sets of this and potentially similar research consortia. In this way, it might be a primer for the construction of consortia-wide composite data sets that take full advantage of the consortia’s rich information stocks, ultimately boosting the visibility and participation of individual research projects.
Conclusions The BEpipeR pipeline permits the user-friendly processing and plot-wise aggregation of Biodiversity Exploratories data. With modifications, this framework may be easily adopted by other research consortia.
- Published
- 2024