McSplicer: a probabilistic model for estimating splice site usage from RNA-seq data
- Source :
- Bioinformatics
- Publication Year :
- 2021
- Publisher :
- Oxford University Press (OUP), 2021.
Abstract
- Motivation: Alternative splicing removes intronic sequences from pre-mRNAs in alternative ways to produce different forms (isoforms) of mature mRNA. The composition of expressed transcripts gives specific functionalities to cells in a particular condition or developmental stage. In addition, a large fraction of human disease mutations affect splicing and lead to aberrant mRNA and protein products. Current methods that interrogate the transcriptome based on RNA-seq either suffer from short-read length when trying to infer full-length transcripts, or are restricted to predefined units of alternative splicing that they quantify from local read evidence.
- Results: Instead of attempting to quantify individual outcomes of the splicing process such as local splicing events or full-length transcripts, we propose to quantify alternative splicing using a simplified probabilistic model of the underlying splicing process. Our model is based on the usage of individual splice sites and can generate arbitrarily complex types of splicing patterns. In our implementation, McSplicer, we estimate the parameters of our model using all read data at once and we demonstrate in our experiments that this yields more accurate estimates compared to competing methods. Our model is able to describe multiple effects of splicing mutations using few, easy to interpret parameters, as we illustrate in an experiment on RNA-seq data from autism spectrum disorder patients.
- Availability and implementation: McSplicer source code is available at https://github.com/canzarlab/McSplicer and has been deposited in archived format at https://doi.org/10.5281/zenodo.4449881.
- Supplementary information: Supplementary data are available at Bioinformatics online.
- Subjects :
- Statistics and Probability
Gene isoform
Source code
AcademicSubjects/SCI01060
Mature messenger RNA
Computer science
Gene Expression
RNA-Seq
Computational biology
Biochemistry
Transcriptome
03 medical and health sciences
0302 clinical medicine
Alternative splicing, RNA-seq, Markov chain
Molecular Biology
030304 developmental biology
0303 health sciences
Messenger RNA
Markov chain
Alternative splicing
Statistical model
Original Papers
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
RNA splicing
030217 neurology & neurosurgery
Details
- ISSN :
- 1460-2059 and 1367-4803
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....31c32288a8cecd1f67f9f3ac2c626686
- Full Text :
- https://doi.org/10.1093/bioinformatics/btab050