Leveraging sequences missing from the human genome to diagnose cancer

Authors :
Ilias Georgakopoulos-Soares
Ofer Yizhar Barnea
Ioannis Mouratidis
Candace S.Y. Chan
Rachael Bradley
Mayank Mahajan
Jasmine Sims
Dianne Laboy Cintron
Ryder Easterlin
Julia S. Kim
Emmalyn Chen
Geovanni Pineda
Guillermo E. Parada
John S. Witte
Christopher A. Maher
Felix Feng
Ioannis Vathiotis
Nikolaos Syrigos
Emmanouil Panagiotou
Andriani Charpidou
Konstantinos Syrigos
Jocelyn Chapman
Mark Kvale
Martin Hemberg
Nadav Ahituv
Publication Year :
2021
Publisher :
Cold Spring Harbor Laboratory, 2021.

Abstract

Cancer diagnosis using cell-free DNA (cfDNA) has the potential to improve treatment and survival but has several technical limitations. Here, we show that tumor-associated mutations create neomers, DNA sequences 13-17 nucleotides in length that are predominantly absent from the genomes of healthy individuals, and that these neomers can accurately detect cancer, including at early stages, and distinguish subtypes and features. Using a neomer-based classifier, we show that we can distinguish twenty-one different tumor types with higher accuracy than state-of-the-art methods. Refinement of this classifier using a handcrafted set of k-mers identified additional cancer features with greater precision. Generation and analysis of 451 cfDNA whole-genome sequences demonstrate that neomers can precisely detect lung and ovarian cancer with an area under the curve (AUC) of 0.93 and 0.89, respectively. In particular, for early stages, we show that neomers can detect lung cancer with an AUC of 0.94 and ovarian cancer, which lacks an early detection test, with an AUC of 0.93. Finally, testing over 9,000 sequences with either promoter or massively parallel reporter assays, we show that neomers can identify cancer-associated mutations that alter regulatory activity. Combined, our results identify a novel, sensitive, specific and simple diagnostic tool that can also identify novel cancer-associated mutations in gene regulatory elements.
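
The core idea of a neomer, a short sequence absent from a reference that a mutation brings into existence, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `kmers` and `candidate_neomers`, the toy sequences, and the choice of k = 4 (instead of the 13-17 nucleotides used in the paper) are assumptions made purely for illustration, and the sketch ignores reverse complements and population-level variation.

```python
from typing import Iterator, Set


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq, skipping any with non-ACGT bases."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


def candidate_neomers(reference: str, sample: str, k: int) -> Set[str]:
    """Return k-mers that occur in sample but never in reference.

    Hypothetical helper: a real analysis would compare against whole
    genomes of healthy individuals, not a single short string.
    """
    reference_kmers = set(kmers(reference, k))
    return {km for km in kmers(sample, k) if km not in reference_kmers}


# Toy sequences (assumed, not from the paper): the sample differs from
# the reference by one substitution, G>C at 0-based index 6.
reference = "ACGTACGTTAGC"
sample = "ACGTACCTTAGC"

print(sorted(candidate_neomers(reference, sample, k=4)))
```

A single substitution can create up to k new k-mers, one for each window that overlaps the changed base, which is why one mutation may yield several candidate sequences at once.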

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........b41595f014c8033fa604d069951765ac
Full Text :
https://doi.org/10.1101/2021.08.15.21261805