Ultrafast prediction of somatic structural variations by filtering out reads matched to pan-genome k-mer sets

Authors :
Jang-il Sohn
Min-Hak Choi
Dohun Yi
Vipin A. Menon
Yeon Jeong Kim
Junehawk Lee
Jung Woo Park
Sungkyu Kyung
Seung-Ho Shin
Byunggook Na
Je-Gun Joung
Young Seok Ju
Min Sun Yeom
Youngil Koh
Sung-Soo Yoon
Daehyun Baek
Tae-Min Kim
Jin-Wu Nam
Source :
Nature Biomedical Engineering
Publication Year :
2021

Abstract

Variant callers typically produce massive numbers of false positives for structural variations, such as cancer-relevant copy-number alterations and fusion genes resulting from genome rearrangements. Here we describe an ultrafast and accurate detector of somatic structural variations that reduces read-mapping costs by filtering out reads matched to pan-genome k-mer sets. The detector, which we named ETCHING (for efficient detection of chromosomal rearrangements and fusion genes), reduces the number of false positives by leveraging machine-learning classifiers trained with six breakend-related features (clipped-read count, split-reads count, supporting paired-end read count, average mapping quality, depth difference and total length of clipped bases). When benchmarked against six callers on reference cell-free DNA, validated biomarkers of structural variants, matched tumour and normal whole genomes, and tumour-only targeted sequencing datasets, ETCHING was 11-fold faster than the second-fastest structural-variant caller at comparable performance and memory use. The speed and accuracy of ETCHING may aid large-scale genome projects and facilitate practical implementations in precision medicine.
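
As a rough illustration of the read-filtering idea mentioned in the abstract, the following is a minimal sketch and not ETCHING's actual implementation. It assumes that a read counts as "matched" when every one of its k-mers occurs in the reference k-mer set, and it ignores reverse complements and sequencing errors. The function names (`kmers`, `build_kmer_set`, `filter_reads`) and the toy sequences are hypothetical.

```python
def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_set(references, k):
    """Collect all k-mers from a list of reference sequences (a toy stand-in for a pan-genome k-mer set)."""
    kmer_set = set()
    for ref in references:
        kmer_set.update(kmers(ref, k))
    return kmer_set


def filter_reads(reads, kmer_set, k):
    """Keep only reads that carry at least one k-mer absent from kmer_set.

    Assumption: reads whose k-mers all match the reference set are treated as
    uninformative and dropped. Reads shorter than k have no k-mers and are
    dropped as well.
    """
    kept = []
    for read in reads:
        if any(km not in kmer_set for km in kmers(read, k)):
            kept.append(read)
    return kept


# Toy data, invented for illustration only.
reference_kmers = build_kmer_set(["ACGTACGTAC"], k=4)
candidate_reads = filter_reads(["CGTACG", "ACGTTTAC"], reference_kmers, k=4)
print(candidate_reads)
```

In this sketch only the reads that survive the filter would go on to the more expensive read-mapping step, which is where the abstract reports the reduction in mapping cost.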

Details

ISSN :
2157-846X
Database :
OpenAIRE
Journal :
Nature Biomedical Engineering
Accession number :
edsair.doi.dedup.....fbd5e4c689e663433873a140ae16ca80