A naïve Bayesian classifier for identifying plant microRNAs

Authors :
Matteo Pellegrini
Robert B. Goldberg
Stephen Douglass
Shawn J. Cokus
John J. Harada
Ssu-Wei Hsu
Source :
The Plant Journal. 86:481-492
Publication Year :
2016
Publisher :
Wiley, 2016.

Abstract

MicroRNAs (miRNAs) are important regulatory molecules in eukaryotic organisms. Existing methods for the identification of mature miRNA sequences in plants rely extensively on the search for stem-loop structures, leading to high false negative rates. Here, we describe a probabilistic method for ranking putative plant miRNAs using a naïve Bayes classifier and its publicly available implementation. We use a number of properties to construct the classifier, including sequence length, number of observations, existence of detectable predicted miRNA* sequences, the distribution of nearby reads and mapping multiplicity. We apply the method to small RNA sequence data from soybean, peach, Arabidopsis and rice and provide experimental validation of several predictions in soybean. The approach performs well overall and strongly enriches for known miRNAs over other types of sequences. By utilizing a Bayesian approach to rank putative miRNAs, our method is able to score miRNAs that would be eliminated by other methods, such as those that have low counts or lack detectable miRNA* sequences. As a result, we are able to detect several soybean miRNA candidates, including some that are 24 nucleotides long, a class that is almost universally eliminated by other methods.
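The ranking idea in the abstract can be illustrated with a minimal sketch of a categorical naïve Bayes scorer. This is not the authors' implementation: the feature names (`length`, `count_bin`, `has_star`, `multiplicity_bin`), the binning, the Laplace smoothing, and the toy training rows are all assumptions made for illustration, and the abstract's "distribution of nearby reads" property is omitted for brevity. The point it shows is that each property contributes evidence to a log-odds score, so a candidate with low counts or no detectable miRNA* is ranked lower rather than discarded outright.

```python
import math
from collections import Counter

# Hypothetical feature names, loosely based on properties listed in the abstract.
FEATURES = ("length", "count_bin", "has_star", "multiplicity_bin")


def train(examples, labels, alpha=1.0):
    """Estimate class priors and per-feature categorical likelihoods.

    Laplace smoothing (alpha) keeps every observed value at non-zero probability.
    """
    classes = sorted(set(labels))
    priors = {c: labels.count(c) / len(labels) for c in classes}
    values = {f: {ex[f] for ex in examples} for f in FEATURES}
    likelihoods = {}
    for c in classes:
        rows = [ex for ex, y in zip(examples, labels) if y == c]
        for f in FEATURES:
            counts = Counter(ex[f] for ex in rows)
            denom = len(rows) + alpha * len(values[f])
            likelihoods[(c, f)] = {v: (counts[v] + alpha) / denom for v in values[f]}
    return priors, likelihoods


def log_odds(candidate, priors, likelihoods, pos="miRNA", neg="other"):
    """Return log P(pos | features) - log P(neg | features) under naive Bayes."""
    score = math.log(priors[pos] / priors[neg])
    for f in FEATURES:
        p = likelihoods[(pos, f)].get(candidate[f])
        q = likelihoods[(neg, f)].get(candidate[f])
        if p is None or q is None:
            continue  # value never seen in training: treated as uninformative
        score += math.log(p / q)
    return score


def row(length, count_bin, has_star, multiplicity_bin):
    return dict(zip(FEATURES, (length, count_bin, has_star, multiplicity_bin)))


# Invented toy training data, for illustration only (not from the paper).
train_x = [
    row(21, "high", True, "low"), row(21, "high", True, "low"),
    row(22, "low", True, "low"), row(24, "low", False, "low"),
    row(24, "low", False, "high"), row(24, "low", False, "high"),
    row(23, "low", False, "low"), row(24, "high", False, "high"),
]
train_y = ["miRNA"] * 4 + ["other"] * 4

priors, likelihoods = train(train_x, train_y)

candidates = {
    "a": row(21, "high", True, "low"),   # canonical-looking candidate
    "b": row(24, "low", False, "low"),   # 24 nt, low count, no miRNA*
    "c": row(24, "low", False, "high"),  # same, but maps to many loci
}
scores = {name: log_odds(c, priors, likelihoods) for name, c in candidates.items()}
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for name, score in ranked:
    print(f"{name}\t{score:.2f}")
```

In this toy setting the 24-nucleotide candidate without a miRNA* still receives a finite score and ranks above the multi-mapping one. A hard filter on length or miRNA* presence would have removed both.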

Details

ISSN :
0960-7412
Volume :
86
Database :
OpenAIRE
Journal :
The Plant Journal
Accession number :
edsair.doi.dedup.....bd468d9bcd4de195a6e9fc6ffdb92497
Full Text :
https://doi.org/10.1111/tpj.13180