1. Comparative analysis of 7 short-read sequencing platforms using the Korean Reference Genome: MGI and Illumina sequencing benchmark for whole-genome sequencing
- Author
Oksung Chung, Hak-Min Kim, Youngseok Yu, Dan Bolser, Asta Blazyte, Hui-Su Kim, Jong Bhak, Sungwon Jeon, Yun Sung Cho, Hwang-Yeol Lee, and Je Hoon Jun
- Subjects
dbSNP, AcademicSubjects/SCI02254, Health Informatics, Genomics, Computational biology, Biology, Data Note, 03 medical and health sciences, 0302 clinical medicine, Republic of Korea, Humans, Genotyping, Illumina dye sequencing, 030304 developmental biology, Whole genome sequencing, 0303 health sciences, Whole Genome Sequencing, Genome, Human, High-Throughput Nucleotide Sequencing, Sequence Analysis, DNA, DNBSEQ-T7, sequencing platform comparison, Computer Science Applications, SNP genotyping, Benchmarking, whole-genome sequencing, AcademicSubjects/SCI00960, Human genome, 030217 neurology & neurosurgery, Reference genome
- Abstract
Background: DNBSEQ-T7 is a new whole-genome sequencer developed by Complete Genomics and MGI using DNA nanoball and combinatorial probe anchor synthesis technologies to generate short reads at a very large scale (up to 60 human genomes per day). However, it has not been objectively and systematically compared against Illumina short-read sequencers.
Findings: By using the same KOREF sample, the Korean Reference Genome, we have compared 7 sequencing platforms: BGISEQ-500, DNBSEQ-T7, HiSeq2000, HiSeq2500, HiSeq4000, HiSeqX10, and NovaSeq6000. We measured sequencing quality by comparing sequencing statistics (base quality, duplication rate, and random error rate), mapping statistics (mapping rate, depth distribution, and percent GC coverage), and variant statistics (transition/transversion ratio, dbSNP annotation rate, and concordance rate with single-nucleotide polymorphism [SNP] genotyping chip) across the 7 sequencing platforms. We found that MGI platforms showed a higher concordance rate for SNP genotyping than HiSeq2000 and HiSeq4000. The similarity matrix of variant calls confirmed that the 2 MGI platforms have the most similar characteristics to the HiSeq2500 platform.
Conclusions: Overall, MGI and Illumina sequencing platforms showed comparable levels of sequencing quality, uniformity of coverage, percent GC coverage, and variant accuracy; thus we conclude that the MGI platforms can be used for a wide range of genomics research fields at a lower cost than the Illumina platforms.
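Two of the variant statistics named in the abstract, the transition/transversion ratio and the concordance rate with a SNP genotyping chip, have conventional definitions that can be illustrated briefly. The abstract does not say how the authors computed them, so the following is only a minimal sketch under the usual definitions. The function names `ti_tv_ratio` and `concordance_rate` and the toy inputs are hypothetical, not taken from the study.

```python
# Illustrative sketch only: conventional definitions, not the authors' pipeline.

BASES = {"A", "C", "G", "T"}
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Return transitions / transversions for (ref, alt) single-base pairs.

    Pairs that are not two distinct bases from A/C/G/T are skipped.
    Returns None when there are no transversions.
    """
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else None


def concordance_rate(chip_genotypes, called_genotypes):
    """Fraction of sites present in both inputs whose genotypes agree.

    Each input maps a site ID to a pair of alleles. Allele order is
    ignored, so ("A", "G") matches ("G", "A").
    Returns None when the inputs share no sites.
    """
    shared = chip_genotypes.keys() & called_genotypes.keys()
    if not shared:
        return None
    matches = sum(
        sorted(chip_genotypes[site]) == sorted(called_genotypes[site])
        for site in shared
    )
    return matches / len(shared)


if __name__ == "__main__":
    # Toy data, invented for illustration.
    print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
    chip = {"rs1": ("A", "G"), "rs2": ("C", "C")}
    calls = {"rs1": ("G", "A"), "rs2": ("C", "T"), "rs3": ("T", "T")}
    print(concordance_rate(chip, calls))
```

In this sketch, concordance is measured only over sites genotyped by both the chip and the sequencer; other denominators are possible and would change the reported rate.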
- Published
2020