201. SquiggleNet: real-time, direct classification of nanopore signals
- Author
Yuwei Bao, Torrin L. McDonald, Robert P. Dickson, Joshua D. Welch, David Blaauw, Jack Wadden, Weichen Zhou, Ryan E. Mills, Piyush Ranjan, Alan P. Boyle, and John R. Erb-Downward
- Subjects
DNA, Bacterial, QH301-705.5, Interspersed repeat, Respiratory System, Method, Sequence alignment, QH426-470, Biology, Genome, Raw signal, Deep Learning, Classifier (linguistics), Genetics, Humans, Biology (General), Read-until, business.industry, Deep learning, Pattern recognition, Nanopore, Nanopore Sequencing, Long Interspersed Nucleotide Elements, Oxford Nanopore, Metagenome, Base calling, Nanopore sequencing, Artificial intelligence, business, Real-time
- Abstract
We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 s of sequencing data, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than alignment-based approaches. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy, generalized to unseen bacterial species in a human respiratory metagenome sample, and accurately classified sequences containing human long interspersed repeat elements. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-021-02511-y.
- Published
2021