1. A Model of Indel Evolution by Finite-State, Continuous-Time Machines
- Author
Ian Holmes
- Subjects
automata; Differential equation; Finite Element Analysis; Markov process; Investigations; Biology; Time; Evolution, Molecular; General Relativity and Quantum Cosmology; symbols.namesake; INDEL Mutation; Expectation–maximization algorithm; Genetics; State space; Applied mathematics; Hidden Markov model; Population and Evolutionary Genetics; Models, Genetic; hidden Markov models; molecular evolution; Markov processes; Markov Chains; phylogenetics; Ordinary differential equation; indels; symbols; Automata theory; Probability distribution; Developmental Biology
- Abstract
How do instantaneous rate models of insertion-deletion processes relate to distributions over pairwise sequence alignments? The only exactly-solved model is the 1991 Thorne... We introduce a systematic method of approximating finite-time transition probabilities for continuous-time insertion-deletion models on sequences. The method uses automata theory to describe the action of an infinitesimal evolutionary generator on a probability distribution over alignments, where both the generator and the alignment distribution can be represented by pair hidden Markov models (HMMs). In general, combining HMMs in this way induces a multiplication of their state spaces; to control this, we introduce a coarse-graining operation to keep the state space at a constant size. This leads naturally to ordinary differential equations for the evolution of the transition probabilities of the approximating pair HMM. The TKF91 model emerges as an exact solution to these equations for the special case of single-residue indels. For the more general case of multiple-residue indels, the equations can be solved by numerical integration. Using simulated data, we show that the resulting distribution over alignments, when compared to previous approximations, is a better fit over a broader range of parameters. We also propose a related approach to develop differential equations for sufficient statistics to estimate the underlying instantaneous indel rates by expectation maximization. Our code and data are available at https://github.com/ihh/trajectory-likelihood.
- Published
- 2020