1. BBmix: a Bayesian Beta-Binomial mixture model for accurate genotyping from RNA-sequencing
- Author
- Vigorito, Elena [0000-0001-6230-3849]; Barton, Anne [0000-0003-3316-2527]; Pitzalis, Costantino; Lewis, Myles J [0000-0001-9365-5345]; Wallace, Chris [0000-0001-9755-1703]; Apollo - University of Cambridge Repository
- Subjects
- Genotype; Sequence Analysis, RNA; High-Throughput Nucleotide Sequencing; RNA; Bayes Theorem; Software
- Abstract
- Motivation: While many pipelines have been developed for calling genotypes using RNA-sequencing data, they have all adapted DNA genotype callers that do not model biases specific to RNA-sequencing, such as reference panel bias or allele-specific expression. Results: Here, we present BBmix, a Bayesian Beta-Binomial mixture model that first learns the expected distribution of read counts for each genotype, and then deploys those learned parameters to call genotypes probabilistically. We benchmarked our model on a wide variety of datasets and showed that our method generally performed better than competitors, mainly due to an increase of up to 1.4% in the accuracy of heterozygous calls. Moreover, BBmix can be easily incorporated into standard pipelines for calling genotypes. We further show that parameters are generally transferable within datasets, such that a single learning run of less than one hour is sufficient to call genotypes in a large number of samples. Availability: We implemented BBmix as an R package that is available for free under a GPL-2 licence at https://gitlab.com/evigorito/bbmix, with an accompanying pipeline at https://gitlab.com/evigorito/bbmix_pipeline.
- Published
- 2022