Weakly supervised learning of RNA modifications from low-resolution epitranscriptome data
- Source :
- Bioinformatics
- Publication Year :
- 2021
- Publisher :
- Oxford University Press (OUP), 2021.
Abstract
- Motivation: Increasing evidence suggests that post-transcriptional ribonucleic acid (RNA) modifications regulate essential biomolecular functions and are related to the pathogenesis of various diseases. Precise identification of RNA modification sites is essential for understanding the regulatory mechanisms of RNAs. To date, many computational approaches for predicting RNA modifications have been developed, most of which rely on strong supervision enabled by base-resolution epitranscriptome data. However, high-resolution data may not be available.
- Results: We propose WeakRM, the first weakly supervised learning framework for predicting RNA modifications from low-resolution epitranscriptome datasets, such as those generated from acRIP-seq and hMeRIP-seq. Evaluations on three independent datasets (corresponding to three different RNA modification types and their respective sequencing technologies) demonstrated the effectiveness of our approach in predicting RNA modifications from low-resolution data. WeakRM outperformed state-of-the-art multi-instance learning methods for genomic sequences, such as WSCNN, which was originally designed for transcription factor binding site prediction. Additionally, our approach captured motifs that are consistent with existing knowledge, and visualization of the predicted modification-containing regions revealed the potential for detecting RNA modifications with improved resolution.
- Availability and implementation: The source code for the WeakRM algorithm, along with the datasets used, is freely accessible at: https://github.com/daiyun02211/WeakRM
- Supplementary information: Supplementary data are available at Bioinformatics online.
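The multi-instance setting mentioned in the abstract can be illustrated with a minimal sketch. This is not the WeakRM implementation. It assumes a low-resolution region is treated as a "bag" of fixed-length sub-sequences ("instances"), that only the bag carries a label, and that the bag score is the maximum instance score. The function names, the window length and stride, the motif `GGACU`, and the toy scoring rule are all hypothetical placeholders for a learned model.

```python
def split_into_instances(seq, length=40, stride=10):
    """Slide a fixed-length window over a region to build a bag of instances.

    Returns (start, subsequence) pairs. Trailing bases that do not fill a
    whole window are ignored in this sketch.
    """
    if len(seq) <= length:
        return [(0, seq)]
    return [(i, seq[i:i + length])
            for i in range(0, len(seq) - length + 1, stride)]


def toy_instance_score(instance, motif="GGACU"):
    """Hypothetical stand-in for a learned instance-level classifier.

    Gives a high score only when the (assumed) motif occurs in the instance.
    """
    return 0.9 if motif in instance else 0.05


def predict_bag(seq, length=40, stride=10):
    """Score every instance, then aggregate with max pooling.

    Returns the bag score, the start of the highest-scoring instance
    (first one on ties), and the list of instance scores.
    """
    instances = split_into_instances(seq, length, stride)
    scores = [toy_instance_score(sub) for _, sub in instances]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], instances[best][0], scores


# A 200-nt region with the assumed motif placed at position 100.
region = "A" * 100 + "GGACU" + "A" * 95
bag_score, best_start, instance_scores = predict_bag(region)
```

Under these assumptions, the position of the highest-scoring instance narrows down where within the region the signal lies, which is the intuition behind using instance-level scores to improve resolution. Other aggregation rules (for example, attention-weighted pooling) would fit in place of the maximum.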
- Subjects :
- Statistics and Probability
Source code
AcademicSubjects/SCI01060
Sequence analysis
Computer science
Computational biology
Biochemistry
03 medical and health sciences
0302 clinical medicine
Molecular Biology
030304 developmental biology
0303 health sciences
Sequence Analysis, RNA
Low resolution
Supervised learning
RNA
Macromolecular Sequence, Structure, and Function
Computer Science Applications
Visualization
DNA binding site
Computational Mathematics
Identification (information)
Computational Theory and Mathematics
Supervised Machine Learning
Algorithms
Software
030217 neurology & neurosurgery
Protein Binding
Details
- ISSN :
- 1460-2059 and 1367-4803
- Volume :
- 37
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....8c63897e56610cd80c19caf608411d71