A naïve Bayesian classifier for identifying plant microRNAs
- Source :
- The Plant Journal. 86:481-492
- Publication Year :
- 2016
- Publisher :
- Wiley, 2016.
Abstract
- MicroRNAs (miRNAs) are important regulatory molecules in eukaryotic organisms. Existing methods for the identification of mature miRNA sequences in plants rely extensively on the search for stem-loop structures, leading to high false negative rates. Here, we describe a probabilistic method for ranking putative plant miRNAs using a naïve Bayes classifier and its publicly available implementation. We use a number of properties to construct the classifier, including sequence length, number of observations, existence of detectable predicted miRNA* sequences, the distribution of nearby reads and mapping multiplicity. We apply the method to small RNA sequence data from soybean, peach, Arabidopsis and rice and provide experimental validation of several predictions in soybean. The approach performs well overall and strongly enriches for known miRNAs over other types of sequences. By utilizing a Bayesian approach to rank putative miRNAs, our method is able to score miRNAs that would be eliminated by other methods, such as those that have low counts or lack detectable miRNA* sequences. As a result, we are able to detect several soybean miRNA candidates, including some that are 24 nucleotides long, a class that is almost universally eliminated by other methods.
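The ranking idea in the abstract can be illustrated with a minimal naive Bayes sketch. This is not the authors' published implementation: the feature names (`length`, `has_star`, `multi_mapped`), the discretization, the toy training records, and the Laplace smoothing constant `alpha` are all assumptions chosen for illustration. The point it shows is that each candidate receives a log posterior odds score, so a sequence that lacks a detectable miRNA* or is 24 nucleotides long is ranked lower rather than filtered out.

```python
import math
from collections import Counter


def train(examples, labels, alpha=1.0):
    """Collect class priors and per-class feature-value counts."""
    classes = sorted(set(labels))
    priors = {c: labels.count(c) / len(labels) for c in classes}
    counts = {c: {} for c in classes}  # counts[class][feature] -> Counter of values
    values = {}                        # values[feature] -> set of values seen in training
    for x, y in zip(examples, labels):
        for feat, val in x.items():
            counts[y].setdefault(feat, Counter())[val] += 1
            values.setdefault(feat, set()).add(val)
    return {"priors": priors, "counts": counts, "values": values, "alpha": alpha}


def _likelihood(model, cls, feat, val):
    """Laplace-smoothed estimate of P(feature = val | class)."""
    cnt = model["counts"][cls].get(feat, Counter())
    k = len(model["values"][feat])
    a = model["alpha"]
    return (cnt[val] + a) / (sum(cnt.values()) + a * k)


def log_odds(model, x, pos="miRNA", neg="other"):
    """Log posterior odds of `pos` versus `neg`, assuming independent features."""
    score = math.log(model["priors"][pos]) - math.log(model["priors"][neg])
    for feat, val in x.items():
        if feat not in model["values"]:
            continue  # feature never seen in training: contributes no evidence
        score += math.log(_likelihood(model, pos, feat, val))
        score -= math.log(_likelihood(model, neg, feat, val))
    return score


# Hypothetical toy training set (not data from the paper).
TRAIN = (
    [({"length": 21, "has_star": True, "multi_mapped": False}, "miRNA")] * 3
    + [({"length": 24, "has_star": False, "multi_mapped": False}, "miRNA")]
    + [({"length": 24, "has_star": False, "multi_mapped": True}, "other")] * 3
    + [({"length": 21, "has_star": False, "multi_mapped": False}, "other")]
)
model = train([x for x, _ in TRAIN], [y for _, y in TRAIN])

# Hypothetical candidates to rank.
candidates = {
    "A": {"length": 21, "has_star": True, "multi_mapped": False},
    "B": {"length": 24, "has_star": False, "multi_mapped": True},
    "C": {"length": 24, "has_star": False, "multi_mapped": False},
}
scores = {name: log_odds(model, feats) for name, feats in candidates.items()}
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for name, score in ranking:
    print(name, round(score, 3))
```

In this toy setup candidate C, a 24-nucleotide sequence without a detectable miRNA*, still receives a finite score and ranks above the multi-mapped candidate B, which is the behavior a hard stem-loop or length filter would not allow.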
- Subjects :
- 0301 basic medicine
Small RNA
Base Sequence
Bayesian probability
Computational Biology
Bayes Theorem
Cell Biology
Plant Science
Computational biology
Biology
Bioinformatics
biology.organism_classification
Bayesian statistics
MicroRNAs
03 medical and health sciences
Bayes' theorem
Naive Bayes classifier
030104 developmental biology
Probabilistic method
Gene Expression Regulation, Plant
RNA, Plant
Arabidopsis
Genetics
Classifier (UML)
Details
- ISSN :
- 09607412
- Volume :
- 86
- Database :
- OpenAIRE
- Journal :
- The Plant Journal
- Accession number :
- edsair.doi.dedup.....bd468d9bcd4de195a6e9fc6ffdb92497
- Full Text :
- https://doi.org/10.1111/tpj.13180