1. A rigorous benchmarking of methods for SARS-CoV-2 lineage abundance estimation in wastewater
- Author
- Munteanu, Viorel, Gordeev, Victor, Saldana, Michael, Aßmann, Eva, Su, Justin Maine, Drabcinski, Nicolae, Zlenko, Oksana, Kit, Maryna, Iordachi, Felicia, Patel, Khooshbu Kantibhai, Nahid, Abdullah Al, Chittampalli, Likhitha, Xu, Yidian, Skums, Pavel, Agrawal, Shelesh, Hölzer, Martin, Smith, Adam, Zelikovsky, Alex, and Mangul, Serghei
- Subjects
Quantitative Biology - Genomics
- Abstract
In light of the continuous transmission and evolution of SARS-CoV-2, coupled with a significant decline in clinical testing, there is a pressing need for scalable, cost-effective, long-term, passive surveillance tools to effectively monitor viral variants circulating in the population. Wastewater genomic surveillance of SARS-CoV-2 has emerged as an alternative to clinical genomic surveillance. It allows continuous monitoring of the prevalence of viral lineages in communities of various sizes at a fraction of the time, cost, and logistical effort, and it serves as an early warning system for emerging variants, which is critical for developed communities and especially for underserved ones. Importantly, lineage prevalence estimates obtained with this approach are not distorted by biases related to clinical testing accessibility and participation. However, the relative performance of bioinformatics methods used to measure relative lineage abundances from wastewater sequencing data is unknown, preventing both the research community and public health authorities from making informed decisions regarding computational tool selection. Here, we perform comprehensive benchmarking of 18 bioinformatics methods for estimating the relative abundance of SARS-CoV-2 (sub)lineages in wastewater by using data from 36 in vitro mixtures of synthetic lineage and sublineage genomes. In addition, we use simulated data from 78 mixtures of lineages and sublineages co-occurring in the clinical setting, with proportions mirroring their prevalence ratios observed in real data. Importantly, we investigate how the accuracy of the evaluated methods is impacted by the sequencing technology used, the associated error rate, the read length, and the read depth, as well as by the exposure of the synthetic RNA mixtures to wastewater, with the goal of capturing the effects induced by the wastewater matrix, including RNA fragmentation and degradation.
Comment: For correspondence: serghei.mangul@gmail.com
- Published
- 2023