Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms

Authors :
Maura Costello
Mark Fleharty
Justin Abreu
Yossi Farjoun
Steven Ferriera
Laurie Holmes
Brian Granger
Lisa Green
Tom Howd
Tamara Mason
Gina Vicente
Michael Dasilva
Wendy Brodeur
Timothy DeSmet
Sheila Dodge
Niall J. Lennon
Stacey Gabriel
Source :
BMC Genomics, Vol 19, Iss 1, Pp 1-10 (2018)
Publication Year :
2018
Publisher :
BMC, 2018.

Abstract

Background: Here we present an in-depth characterization of the mechanism of sequencer-induced sample contamination due to index swapping, a phenomenon that affects Illumina sequencers employing patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeqX, HiSeq 4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.

Results: Leveraging data collected over a two-year period, we demonstrate the widespread prevalence of index swapping in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, demonstrating that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross-contamination by utilizing non-redundant dual indexing for complete filtering of index-swapped reads, and share the sequences for 96 non-combinatorial dual indexes we have validated across various library preparation methods and sequencer models. Finally, using computational methods, we provide greater insight into the mechanism of index swapping.

Conclusions: Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2 to 6% in all sequencing runs on HiSeqX, HiSeq 4000/3000, and NovaSeq. Utilizing non-redundant dual indexing allows for the removal (flagging/filtering) of these swapped reads and eliminates swapping-induced sample contamination, which is critical for sensitive applications such as RNA-seq, single cell, blood biopsy using circulating tumor DNA, or clinical sequencing.
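
The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the index sequences, sample names, and the helpers `is_non_redundant`, `classify_read`, and `swap_rate` are hypothetical placeholders introduced here for illustration. The assumption is that each sample carries an i7 and an i5 index used by no other sample in the pool, so a read whose two indexes are both known but belong to different samples can be flagged as swapped rather than being assigned to the wrong sample.

```python
from collections import Counter

# Hypothetical sample sheet (placeholder sequences, not the paper's validated set).
# Non-redundant: every i7 and every i5 appears for exactly one sample.
EXPECTED_PAIRS = {
    ("AAACCCGG", "TTTGGGCC"): "sample_A",
    ("ACACGTGT", "CATCATGA"): "sample_B",
    ("GGTTAACC", "AGAGTCTC"): "sample_C",
}

SWAPPED = "swapped"
UNKNOWN = "unknown"


def is_non_redundant(pairs):
    """Return True if no i7 and no i5 index is shared between samples."""
    i7s = [i7 for i7, _ in pairs]
    i5s = [i5 for _, i5 in pairs]
    return len(set(i7s)) == len(i7s) and len(set(i5s)) == len(i5s)


def classify_read(i7, i5, expected=EXPECTED_PAIRS):
    """Assign a read to a sample, or flag it as swapped or unknown."""
    if (i7, i5) in expected:
        return expected[(i7, i5)]
    known_i7 = {pair[0] for pair in expected}
    known_i5 = {pair[1] for pair in expected}
    # Both indexes are valid but come from different samples: an index swap.
    if i7 in known_i7 and i5 in known_i5:
        return SWAPPED
    # At least one index matches nothing in the pool (e.g. a sequencing error).
    return UNKNOWN


def swap_rate(reads, expected=EXPECTED_PAIRS):
    """Fraction of swapped reads among reads whose two indexes are both known."""
    counts = Counter(classify_read(i7, i5, expected) for i7, i5 in reads)
    swapped = counts[SWAPPED]
    assigned = sum(n for label, n in counts.items()
                   if label not in (SWAPPED, UNKNOWN))
    total = swapped + assigned
    return swapped / total if total else 0.0
```

With a combinatorial design, where the same i7 or i5 is reused across samples, a swapped pair can coincide with another sample's expected pair and cannot be detected this way; that is the case `is_non_redundant` is meant to rule out before filtering.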

Details

Language :
English
ISSN :
14712164
Volume :
19
Issue :
1
Database :
Directory of Open Access Journals
Journal :
BMC Genomics
Publication Type :
Academic Journal
Accession number :
edsdoj.2079b2d7551942bc810c187c8e5c737e
Document Type :
article
Full Text :
https://doi.org/10.1186/s12864-018-4703-0