Dark-matter matters: Discriminating subtle blood cancers using the darkest DNA.

Authors :
Parida, Laxmi
Haferlach, Claudia
Rhrissorrakrai, Kahn
Utro, Filippo
Levovitz, Chaya
Kern, Wolfgang
Nadarajah, Niroshan
Twardziok, Sven
Hutter, Stephan
Meggendorfer, Manja
Walter, Wencke
Baer, Constance
Haferlach, Torsten
Source :
PLoS Computational Biology; 8/30/2019, Vol. 15 Issue 8, p1-12, 12p, 2 Diagrams
Publication Year :
2019

Abstract

The confluence of deep sequencing and powerful machine learning is providing an unprecedented peek at the darkest of the dark genomic matter, the non-coding genomic regions lacking any functional annotation. While deep sequencing uncovers rare tumor variants, the heterogeneity of the disease confounds the best of machine learning (ML) algorithms. Here we set out to answer whether the dark matter of the genome encompasses signals that can distinguish the fine subtypes of disease that are otherwise genomically indistinguishable. We introduce a novel stochastic regularization, ReVeaL, that empowers ML to discriminate subtle cancer subtypes even from the same ‘cell of origin’. Analogous to heritability, implicitly defined on the whole genome, we use predictability (F1 score) definable on portions of the genome. In an effort to distinguish cancer subtypes using dark-matter DNA, we applied ReVeaL to a new WGS dataset from 727 patient samples with seven forms of hematological cancers and assessed the predictivity over several genomic regions including genic, non-dark, non-coding, non-genic, and dark. ReVeaL enabled improved discrimination of cancer subtypes for all segments of the genome. The non-genic, non-coding and dark-matter regions had the highest F1 scores, with dark matter having the highest level of predictability. Based on ReVeaL’s predictability of different genomic regions, dark matter contains enough signal to significantly discriminate fine subtypes of disease. Hence, the agglomeration of rare variants, even in the hitherto unannotated and ill-understood regions of the genome, may play a substantial role in the disease etiology and deserves much more attention. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1553734X
Volume :
15
Issue :
8
Database :
Complementary Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
138380152
Full Text :
https://doi.org/10.1371/journal.pcbi.1007332