Accurate spliced alignment of long RNA sequencing reads
- Source :
- Bioinformatics
- Publication Year :
- 2020
- Publisher :
- Cold Spring Harbor Laboratory, 2020.
- Abstract :
- Motivation: Long-read RNA sequencing technologies are establishing themselves as the primary techniques for detecting novel isoforms, and many such analyses depend on read alignments. However, the error rate and length of the reads create new challenges for aligning them accurately, particularly around small exons. Results: We present uLTRA, an alignment method for long RNA sequencing reads based on a novel two-pass collinear chaining algorithm. We show that uLTRA is more accurate than state-of-the-art aligners, with substantially higher accuracy for small exons, on simulated and synthetic data. On simulated data, uLTRA achieves an accuracy of about 60% for exons of 10 nucleotides or fewer and close to 90% for exons of 11 to 20 nucleotides. On biological data, where the true read location is unknown, we show several examples in which uLTRA aligns reads to known and novel isoforms containing small exons that other aligners do not detect. While uLTRA obtains its accuracy using annotations, it can also be used as a wrapper around minimap2 to align reads outside annotated regions. Availability and implementation: uLTRA is available at https://github.com/ksahlin/ultra. Supplementary information: Supplementary data are available at Bioinformatics online.
- Subjects :
- Statistics and Probability
AcademicSubjects/SCI01060
Computer science
Word error rate
Computational biology
Biology
Biochemistry
Genome
DISEASE
Synthetic data
Transcriptome
03 medical and health sciences
Exon
0302 clinical medicine
splice
Molecular Biology
030304 developmental biology
Supplementary data
0303 health sciences
Biological data
RNA
113 Computer and information sciences
Genome Analysis
Original Papers
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
Simulated data
Chaining
HISAT
030217 neurology & neurosurgery
Details
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....dfad8c8aee7ec7a3278ec56f5430362d