Towards Discovering SARS-CoV-2 Variants of High Consequence Based on Both Surveillance and Electronically Captured Health Data: First Year Experience in Washington State (January 2020-2021)

Authors :
Lue Ping Zhao
Pavitra Roychoudhury
Keith R. Jerome
Peter B. Gilbert
Joshua Schiffer
Terry Lybrand
Thomas H. Payne
April Randhawa
Margaret G. Mills
Alex Greninger
Chul-woo Pyo
Ruihan Wang
Renyu Li
Alexander S. Thomas
Brandon M. Norris
Wyatt C. Nelson
Daniel E. Geraghty
Source :
SSRN Electronic Journal.
Publication Year :
2021
Publisher :
Elsevier BV, 2021.

Abstract

Background: SARS-CoV-2 is continuously evolving, with the emergence of variants of interest (VOI) or variants of concern (VOC). While Variants of High Consequence (VOHC) are well defined, no such variants have been formally documented. Here we propose an integrated strategy and application towards discovering VOHC.

Methods: We utilized 7,137 viral sequences collected from COVID-19 cases in Washington State from January 19, 2020 to January 31, 2021, to identify genome-wide viral single nucleotide variants (SNVs). Utilizing a non-parametric regression model, we selected a subset of SNVs that had significant and substantial expansions over the collection period. Further, using unsupervised learning, we identified multiple SNVs forming haplotypes. To evaluate their clinical relevance, we assembled a discovery cohort of COVID-19 cases (388 inpatients and 295 outpatients) to identify SNVs and haplotypes associated with hospitalization status, a proxy for disease severity. A logistic regression model was used to assess associations of SNVs with hospitalization status in the discovery cohort. These results were validated on an independent cohort of 964 genome sequences derived from COVID-19 cases in Washington State from June 1, 2020 to March 31, 2021.

Finding: The analysis of the 7,137 sequences led to identification of 107 SNVs that were statistically significant (false positive error rate q-value 0.10). Forty-one SNVs were considered urgent, because their SNV proportions persisted or expanded above 10% in January 2021, the last month of the current investigation period. Correlating with clinical data, eight SNVs were found to significantly associate with inpatient status (p-values

Details

ISSN :
15565068
Database :
OpenAIRE
Journal :
SSRN Electronic Journal
Accession number :
edsair.doi...........8667b2344b600f849d177a0dd54df80d
Full Text :
https://doi.org/10.2139/ssrn.3893567