89 results for "Lartillot N"
Search Results
2. Fast optimization of statistical potentials for structurally constrained phylogenetic models
- Author
- Rodrigue Nicolas, Kleinman Claudia L, Bonnard Cécile, and Lartillot Nicolas
- Subjects
- Evolution, QH359-425
- Abstract
Background: Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov chain Monte Carlo sampling algorithms. Results: Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We find that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000-fold decrease in computational time for the optimization procedure). Conclusion: Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.
- Published
- 2009
- Full Text
- View/download PDF
3. Evaluation of the models handling heterotachy in phylogenetic inference
- Author
- Philippe Hervé, Lartillot Nicolas, Rodrigue Nicolas, and Zhou Yan
- Subjects
- Evolution, QH359-425
- Abstract
Background: The evolutionary rate at a given homologous position varies across time. When sufficiently pronounced, this phenomenon, called heterotachy, may produce artefactual phylogenetic reconstructions under the commonly used models of sequence evolution. These observations have motivated the development of models that explicitly recognize heterotachy, with research directions proposed along two main axes: 1) the covarion approach, where sites switch from variable to invariable states; and 2) the mixture of branch lengths (MBL) approach, where alignment patterns are assumed to arise from one of several sets of branch lengths, under a given phylogeny. Results: Here, we report the first statistical comparisons contrasting the performance of covarion and MBL modeling strategies. Using simulations under heterotachous conditions, we explore the properties of three model comparison methods: the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and cross validation. Although more time consuming, cross validation appears more reliable than AIC and BIC as it directly measures the predictive power of a model on 'future' data. We also analyze three large datasets (nuclear proteins of animals, mitochondrial proteins of mammals, and plastid proteins of plants), and find the optimal number of components of the MBL model to be two for all datasets, indicating that this model is preferred over the standard homogeneous model. However, the covarion model is always favored over the optimal MBL model. Conclusion: We demonstrated, using three large datasets, that the covarion model is more efficient at handling heterotachy than the MBL model. This is probably due to the fact that the MBL model requires a serious increase in the number of parameters, as compared to two supplementary parameters of the covarion approach. Further improvements of both the mixture and the covarion approaches might be obtained by modeling heterogeneous behavior both along time and across sites.
- Published
- 2007
- Full Text
- View/download PDF
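The two information criteria named in the abstract of result 3 follow standard textbook definitions, which can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names and the example values (log-likelihoods, parameter counts, number of sites) are hypothetical, and the sample size is assumed here to be the number of alignment sites.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better).

    n_sites plays the role of the sample size n (an assumption of this sketch).
    """
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical fits of two models to the same alignment of 1000 sites.
# Model B has a higher likelihood but many more parameters.
model_a = {"lnL": -5000.0, "k": 2}
model_b = {"lnL": -4990.0, "k": 40}

for name, m in (("A", model_a), ("B", model_b)):
    print(name, aic(m["lnL"], m["k"]), bic(m["lnL"], m["k"], 1000))
```

Both criteria penalize the parameter count, which is the trade-off the abstract points to when it contrasts the many extra branch-length parameters of the MBL model with the two extra parameters of the covarion model.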
4. A maximum likelihood framework for protein design
- Author
- Philippe Hervé, Bonnard Cécile, Rodrigue Nicolas, Kleinman Claudia L, and Lartillot Nicolas
- Subjects
- Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Background: The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility. Results: We propose a formulation of the protein design problem in terms of model-based statistical inference. Our framework uses the maximum likelihood principle to optimize the unknown parameters of a statistical potential, which we call an inverse potential to contrast with classical potentials used for structure prediction. We propose an implementation based on Markov chain Monte Carlo, in which the likelihood is maximized by gradient descent and is numerically estimated by thermodynamic integration. The fit of the models is evaluated by cross-validation. We apply this to a simple pairwise contact potential, supplemented with a solvent-accessibility term, and show that the resulting models have a better predictive power than currently available pairwise potentials. Furthermore, the model comparison method presented here allows one to measure the relative contribution of each component of the potential, and to choose the optimal number of accessibility classes, which turns out to be much higher than classically considered. Conclusion: Altogether, this reformulation makes it possible to test a wide diversity of models, using different forms of potentials, or accounting for other factors than just the constraint of thermodynamic stability. Ultimately, such model-based statistical analyses may help to understand the forces shaping protein sequences, and driving their evolution.
- Published
- 2006
- Full Text
- View/download PDF
5. Genome Streamlining: Effect of Mutation Rate and Population Size on Genome Size Reduction.
- Author
- Luiselli J, Rouzaud-Cornabas J, Lartillot N, and Beslon G
- Subjects
- Evolution, Molecular, Models, Genetic, Bacteria genetics, Genome Size, Mutation Rate, Genome, Bacterial, Population Density
- Abstract
Genome streamlining, i.e. genome size reduction, is observed in bacteria with very different life traits, including endosymbiotic bacteria and several marine bacteria, raising the question of its evolutionary origin. None of the hypotheses proposed in the literature is firmly established, mainly due to the many confounding factors related to the diverse habitats of species with streamlined genomes. Computational models may help overcome these difficulties and rigorously test hypotheses. In this work, we used Aevol, a platform designed to study the evolution of genome architecture, to test 2 main hypotheses: that an increase in population size (N) or mutation rate (μ) could cause genome reduction. In our experiments, both conditions lead to streamlining but have very different resulting genome structures. Under increased population sizes, genomes lose a significant fraction of noncoding sequences but maintain their coding size, resulting in densely packed genomes (akin to streamlined marine bacteria genomes). By contrast, under an increased mutation rate, genomes lose both coding and noncoding sequences (akin to endosymbiotic bacteria genomes). Hence, both factors lead to an overall reduction in genome size, but the coding density of the genome appears to be determined by N×μ. Thus, a broad range of genome size and density can be achieved by different combinations of N and μ. Our results suggest that genome size and coding density are determined by the interplay between selection for phenotypic adaptation and selection for robustness. Competing interests: The authors declare no competing interests. © The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
- Published
- 2024
- Full Text
- View/download PDF
6. Imbalanced speciation pulses sustain the radiation of mammals.
- Author
- Quintero I, Lartillot N, and Morlon H
- Subjects
- Animals, Biodiversity, Extinction, Biological, Fossils, Genetic Speciation, Mammals classification, Mammals genetics, Phylogeny
- Abstract
The evolutionary histories of major clades, including mammals, often comprise changes in their diversification dynamics, but how these changes occur remains debated. We combined comprehensive phylogenetic and fossil information in a new "birth-death diffusion" model that provides a detailed characterization of variation in diversification rates in mammals. We found an early rising and sustained diversification scenario, wherein speciation rates increased before and during the Cretaceous-Paleogene (K-Pg) boundary. The K-Pg mass extinction event filtered out more slowly speciating lineages and was followed by a subsequent slowing in speciation rates rather than rebounds. These dynamics arose from an imbalanced speciation process, with separate lineages giving rise to many, less speciation-prone descendants. Diversity seems to have been brought about by these isolated, fast-speciating lineages, rather than by a few punctuated innovations.
- Published
- 2024
- Full Text
- View/download PDF
7. Bridging the gap between the evolutionary dynamics and the molecular mechanisms of meiosis: A model based exploration of the PRDM9 intra-genomic Red Queen.
- Author
- Genestier A, Duret L, and Lartillot N
- Subjects
- Animals, Mice, Gene Conversion, DNA Breaks, Double-Stranded, Alleles, Models, Genetic, Humans, Recombination, Genetic, Histone-Lysine N-Methyltransferase genetics, Histone-Lysine N-Methyltransferase metabolism, Meiosis genetics, Evolution, Molecular
- Abstract
Molecular dissection of meiotic recombination in mammals, combined with population-genetic and comparative studies, has revealed a complex evolutionary dynamic characterized by short-lived recombination hotspots. Hotspots are chromosome positions containing DNA sequences where the protein PRDM9 can bind and cause crossing-over. To explain this fast evolutionary dynamic, a so-called intra-genomic Red Queen model has been proposed, based on the interplay between two antagonistic forces: biased gene conversion, mediated by double-strand breaks, resulting in hotspot extinction (the hotspot conversion paradox), followed by positive selection favoring mutant PRDM9 alleles recognizing new sequence motifs. Although this model predicts many empirical observations, the exact causes of the positive selection acting on new PRDM9 alleles are still not well understood. In this direction, experiments on mouse hybrids have suggested that, in addition to targeting double-strand breaks, PRDM9 has another role during meiosis. Specifically, PRDM9 symmetric binding (simultaneous binding at the same site on both homologues) would facilitate homology search and, as a result, the pairing of the homologues. Although discovered in hybrids, this second function of PRDM9 could also be involved in the evolutionary dynamic observed within populations. To address this point, here, we present a theoretical model of the evolutionary dynamic of meiotic recombination integrating current knowledge about the molecular function of PRDM9. Our modeling work gives important insights into the selective forces driving the turnover of recombination hotspots. Specifically, the reduced symmetrical binding of PRDM9 caused by the loss of high affinity binding sites induces a net positive selection eliciting new PRDM9 alleles recognizing new targets. The model also offers new insights about the influence of the gene dosage of PRDM9, which can paradoxically result in negative selection on new PRDM9 alleles entering the population, driving their eviction and thus reducing standing variation at this locus. Competing interests: The authors have declared that no competing interests exist. Copyright: © 2024 Genestier et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Published
- 2024
- Full Text
- View/download PDF
8. Compositionally Constrained Sites Drive Long-Branch Attraction.
- Author
- Szánthó LL, Lartillot N, Szöllősi GJ, and Schrempf D
- Subjects
- Animals, Phylogeny, Bias, Models, Genetic, Microsporidia
- Abstract
Accurate phylogenies are fundamental to our understanding of the pattern and process of evolution. Yet, phylogenies at deep evolutionary timescales, with correspondingly long branches, have been fraught with controversy resulting from conflicting estimates from models with varying complexity and goodness of fit. Analyses of historical as well as current empirical datasets, such as alignments including Microsporidia, Nematoda, or Platyhelminthes, have demonstrated that inadequate modeling of across-site compositional heterogeneity, which is the result of biochemical constraints that lead to varying patterns of accepted amino acids along sequences, can lead to erroneous topologies that are strongly supported. Unfortunately, models that adequately account for across-site compositional heterogeneity remain computationally challenging or intractable for an increasing fraction of contemporary datasets. Here, we introduce "compositional constraint analysis," a method to investigate the effect of site-specific constraints on amino acid composition on phylogenetic inference. We show that more constrained sites with lower diversity and less constrained sites with higher diversity exhibit ostensibly conflicting signals under models ignoring across-site compositional heterogeneity that lead to long-branch attraction artifacts and demonstrate that more complex models accounting for across-site compositional heterogeneity can ameliorate this bias. We present CAT-posterior mean site frequencies (PMSF), a pipeline for diagnosing and resolving phylogenetic bias resulting from inadequate modeling of across-site compositional heterogeneity based on the CAT model. CAT-PMSF is robust against long-branch attraction in all alignments we have examined. We suggest using CAT-PMSF when convergence of the CAT model cannot be assured. We find evidence that compositionally constrained sites are driving long-branch attraction in two metazoan datasets and recover evidence for Porifera as the sister group to all other animals. [Animal phylogeny; cross-site heterogeneity; long-branch attraction; phylogenomics.] © The Author(s) 2023. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
- Published
- 2023
- Full Text
- View/download PDF
9. Identifying the Best Approximating Model in Bayesian Phylogenetics: Bayes Factors, Cross-Validation or wAIC?
- Author
- Lartillot N
- Subjects
- Bayes Theorem, Computer Simulation, Probability, Markov Chains, Monte Carlo Method, Phylogeny
- Abstract
There is still no consensus as to how to select models in Bayesian phylogenetics, and more generally in applied Bayesian statistics. Bayes factors are often presented as the method of choice, yet other approaches have been proposed, such as cross-validation or information criteria. Each of these paradigms raises specific computational challenges, but they also differ in their statistical meaning, being motivated by different objectives: either testing hypotheses or finding the best-approximating model. These alternative goals entail different compromises, and as a result, Bayes factors, cross-validation, and information criteria may be valid for addressing different questions. Here, the question of Bayesian model selection is revisited, with a focus on the problem of finding the best-approximating model. Several model selection approaches were re-implemented, numerically assessed and compared: Bayes factors, cross-validation (CV), in its different forms (k-fold or leave-one-out), and the widely applicable information criterion (wAIC), which is asymptotically equivalent to leave-one-out cross-validation (LOO-CV). Using a combination of analytical results and empirical and simulation analyses, it is shown that Bayes factors are unduly conservative. In contrast, CV represents a more adequate formalism for selecting the model returning the best approximation of the data-generating process and the most accurate estimates of the parameters of interest. Among alternative CV schemes, LOO-CV and its asymptotic equivalent represented by the wAIC stand out as the best choices, conceptually and computationally, given that both can be simultaneously computed based on standard Markov chain Monte Carlo runs under the posterior distribution. [Bayes factor; cross-validation; marginal likelihood; model comparison; wAIC.] © The Author(s) 2023. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
- Published
- 2023
- Full Text
- View/download PDF
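The abstract of result 9 notes that wAIC and LOO-CV can both be computed from the same posterior MCMC sample. A minimal sketch of how that might look is given below, using the standard wAIC definition (log pointwise predictive density minus the posterior variance of the log-likelihood) and a plain importance-sampling (harmonic-mean) LOO estimator. The estimator choice, the function names, and the input layout `loglik[s][i]` (log-likelihood of site i under posterior draw s) are assumptions of this sketch, not details taken from the paper; the plain importance-sampling estimator can be unstable in practice.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic_and_loo(loglik):
    """Return (elpd_waic, elpd_loo) on the log scale; higher is better.

    loglik[s][i] is the log-likelihood of site i under posterior draw s.
    """
    n_draws = len(loglik)
    n_sites = len(loglik[0])
    lppd = 0.0      # log pointwise predictive density
    p_waic = 0.0    # effective number of parameters (variance form)
    elpd_loo = 0.0  # importance-sampling leave-one-out estimate
    for i in range(n_sites):
        col = [loglik[s][i] for s in range(n_draws)]
        lppd += log_mean_exp(col)
        mean = sum(col) / n_draws
        p_waic += sum((v - mean) ** 2 for v in col) / (n_draws - 1)
        # Harmonic mean of the site likelihoods across posterior draws.
        elpd_loo += -log_mean_exp([-v for v in col])
    return lppd - p_waic, elpd_loo


# Hypothetical sample: 3 posterior draws, 2 sites.
sample = [[-1.0, -2.1], [-1.2, -1.9], [-0.9, -2.0]]
print(waic_and_loo(sample))
```

Both quantities reuse the per-site log-likelihoods already evaluated during a posterior run, so no additional MCMC runs are needed, which is the computational point the abstract makes.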
10. Genes and sites under adaptation at the phylogenetic scale also exhibit adaptation at the population-genetic scale.
- Author
- Latrille T, Rodrigue N, and Lartillot N
- Subjects
- Humans, Female, Pregnancy, Animals, Phylogeny, Placenta, Genetics, Population, Codon, Models, Genetic, Mammals genetics, Evolution, Molecular, Selection, Genetic
- Abstract
Adaptation in protein-coding sequences can be detected from multiple sequence alignments across species or alternatively by leveraging polymorphism data within a population. Across species, quantification of the adaptive rate relies on phylogenetic codon models, classically formulated in terms of the ratio of nonsynonymous over synonymous substitution rates. Evidence of an accelerated nonsynonymous substitution rate is considered a signature of pervasive adaptation. However, because of the background of purifying selection, these models are potentially limited in their sensitivity. Recent developments have led to more sophisticated mutation-selection codon models aimed at making a more detailed quantitative assessment of the interplay between mutation, purifying, and positive selection. In this study, we conducted a large-scale exome-wide analysis of placental mammals with mutation-selection models, assessing their performance at detecting proteins and sites under adaptation. Importantly, mutation-selection codon models are based on a population-genetic formalism and thus are directly comparable to the McDonald and Kreitman test at the population level to quantify adaptation. Taking advantage of this relationship between phylogenetic and population genetics analyses, we integrated divergence and polymorphism data across the entire exome for 29 populations across 7 genera and showed that proteins and sites detected to be under adaptation at the phylogenetic scale are also under adaptation at the population-genetic scale. Altogether, our exome-wide analysis shows that phylogenetic mutation-selection codon models and the population-genetic test of adaptation can be reconciled and are congruent, paving the way for integrative models and analyses across individuals and populations.
- Published
- 2023
- Full Text
- View/download PDF
11. An Improved Codon Modeling Approach for Accurate Estimation of the Mutation Bias.
- Author
- Latrille T and Lartillot N
- Subjects
- Codon genetics, Models, Genetic, Mutation, Phylogeny, Genetic Code, Selection, Genetic
- Abstract
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation-selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions. © The Author(s) 2022. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
- Published
- 2022
- Full Text
- View/download PDF
12. Natural Selection beyond Life? A Workshop Report.
- Author
- Charlat S, Ariew A, Bourrat P, Ferreira Ruiz M, Heams T, Huneman P, Krishna S, Lachmann M, Lartillot N, Le Sergeant d'Hendecourt L, Malaterre C, Nghe P, Rajon E, Rivoire O, Smerlak M, and Zeravcic Z
- Abstract
Natural selection is commonly seen not just as an explanation for adaptive evolution, but as the inevitable consequence of "heritable variation in fitness among individuals". Although it remains embedded in biological concepts, such a formalisation makes it tempting to explore whether this precondition may be met not only in life as we know it, but also in other physical systems. This would imply that these systems are subject to natural selection and may perhaps be investigated in a biological framework, where properties are typically examined in light of their putative functions. Here we relate the major questions that were debated during a three-day workshop devoted to discussing whether natural selection may take place in non-living physical systems. We start this report with a brief overview of research fields dealing with "life-like" or "proto-biotic" systems, where mimicking evolution by natural selection in test tubes stands as a major objective. We contend the challenge may be as much conceptual as technical. Taking the problem from a physical angle, we then discuss the framework of dissipative structures. Although life is viewed in this context as a particular case within a larger ensemble of physical phenomena, this approach does not provide general principles from which natural selection can be derived. Turning back to evolutionary biology, we ask to what extent the most general formulations of the necessary conditions or signatures of natural selection may be applicable beyond biology. In our view, such a cross-disciplinary jump is impeded by reliance on individuality as a central yet implicit and loosely defined concept. Overall, these discussions thus lead us to conjecture that understanding, in physico-chemical terms, how individuality emerges and how it can be recognised, will be essential in the search for instances of evolution by natural selection outside of living systems.
- Published
- 2021
- Full Text
- View/download PDF
13. Inferring Long-Term Effective Population Size with Mutation-Selection Models.
- Author
- Latrille T, Lanore V, and Lartillot N
- Subjects
- Animals, Bayes Theorem, Evolution, Molecular, Mammals, Mutation, Phylogeny, Population Density, Models, Genetic, Selection, Genetic
- Abstract
Mutation-selection phylogenetic codon models are grounded in population genetics first principles and represent a principled approach for investigating the intricate interplay between mutation, selection, and drift. In their current form, mutation-selection codon models are entirely characterized by the collection of site-specific amino-acid fitness profiles. However, thus far, they have relied on the assumption of a constant genetic drift, translating into a unique effective population size (Ne) across the phylogeny, clearly an unrealistic assumption. This assumption can be alleviated by introducing variation in Ne between lineages. In addition to Ne, the mutation rate (μ) is likely to vary between lineages, and both should covary with life-history traits (LHTs). This suggests that the model should more globally account for the joint evolutionary process followed by all of these lineage-specific variables (Ne, μ, and LHTs). In this direction, we introduce an extended mutation-selection model jointly reconstructing in a Bayesian Monte Carlo framework the fitness landscape across sites and long-term trends in Ne, μ, and LHTs along the phylogeny, from an alignment of DNA coding sequences and a matrix of observed LHTs in extant species. The model was tested against simulated data and applied to empirical data in mammals, isopods, and primates. The reconstructed history of Ne in these groups appears to correlate with LHTs or ecological variables in a way that suggests that the reconstruction is reasonable, at least in its global trends. On the other hand, the range of variation in Ne inferred across species is surprisingly narrow. This last point suggests that some of the assumptions of the model, in particular concerning the assumed absence of epistatic interactions between sites, are potentially problematic. © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
- Published
- 2021
- Full Text
- View/download PDF
14. Reconstructing the History of Variation in Effective Population Size along Phylogenies.
- Author
- Brevet M and Lartillot N
- Subjects
- Animals, Genetic Variation, Models, Genetic, Mutation Rate, Phylogeny, Population Density, Evolution, Molecular, Selection, Genetic
- Abstract
The nearly neutral theory predicts specific relations between effective population size (Ne) and patterns of divergence and polymorphism, which depend on the shape of the distribution of fitness effects (DFE) of new mutations. However, testing these relations is not straightforward, owing to the difficulty in estimating Ne. Here, we introduce an integrative framework allowing for an explicit reconstruction of the phylogenetic history of Ne, thus leading to a quantitative test of the nearly neutral theory and an estimation of the allometric scaling of the ratios of nonsynonymous over synonymous polymorphism (πN/πS) and divergence (dN/dS) with respect to Ne. As an illustration, we applied our method to primates, for which the nearly neutral predictions were mostly verified. Under a purely nearly neutral model with a constant DFE across species, we find that the variation in πN/πS and dN/dS as a function of Ne is too large to be compatible with current estimates of the DFE based on site frequency spectra. The reconstructed history of Ne shows a 10-fold variation across primates. The mutation rate per generation u, also reconstructed over the tree by the method, varies over a 3-fold range and is negatively correlated with Ne. As a result of these opposing trends for Ne and u, variation in πS is intermediate, primarily driven by Ne but substantially influenced by u. Altogether, our integrative framework provides a quantitative assessment of the role of Ne and u in modulating patterns of genetic variation, while giving a synthetic picture of their history over the clade. © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
- Published
- 2021
- Full Text
- View/download PDF
15. Erratum to: A Bayesian mutation-selection framework for detecting site-specific adaptive evolution in protein-coding genes.
- Author
- Rodrigue N, Latrille T, and Lartillot N
- Published
- 2021
- Full Text
- View/download PDF
16. Detecting sex-linked genes using genotyped individuals sampled in natural populations.
- Author
- Käfer J, Lartillot N, Marais GAB, and Picard F
- Subjects
- Genes, Plant, Genes, X-Linked, Genes, Y-Linked, Haplotypes, Humans, Models, Genetic, Polymorphism, Genetic, Recombination, Genetic, Silene genetics, Chromosome Mapping methods, Chromosomes, Human genetics, Chromosomes, Plant genetics, Sex Chromosomes genetics
- Abstract
We propose a method, SDpop, able to infer sex-linkage caused by recombination suppression typical of sex chromosomes. The method is based on the modeling of the allele and genotype frequencies of individuals of known sex in natural populations. It is implemented in a hierarchical probabilistic framework, accounting for different sources of error. It allows statistical testing for the presence or absence of sex chromosomes, and detection of sex-linked genes based on the posterior probabilities in the model. Furthermore, for gametologous sequences, the haplotype and level of nucleotide polymorphism of each copy can be inferred, as well as the divergence between them. We test the method using simulated data, as well as data from both a relatively recent and an old sex chromosome system (the plant Silene latifolia and humans) and show that, for most cases, robust predictions are obtained with 5 to 10 individuals per sex. © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. All rights reserved. For permissions, please email: journals.permissions@oup.com.
- Published
- 2021
- Full Text
- View/download PDF
17. Long-Lived Species of Bivalves Exhibit Low MT-DNA Substitution Rates.
- Author
- Mortz M, Levivier A, Lartillot N, Dufresne F, and Blier PU
- Abstract
Bivalves represent a valuable taxonomic group for aging studies given their wide variation in longevity (from 1-2 to >500 years). It is well known that aging is associated with the maintenance of Reactive Oxygen Species homeostasis and that the accumulation of mitochondrial phenotype and genotype dysfunctions is a hallmark of these processes. Previous studies have shown that mitochondrial DNA mutation rates are linked to lifespan in vertebrate species, but no study has explored this in invertebrates. To this end, we performed a Bayesian Phylogenetic Covariance model of evolution analysis using 12 mitochondrial protein-coding genes of 76 bivalve species. Three life history traits (maximum longevity, generation time and mean temperature tolerance) were tested against 1) synonymous substitution rates (dS), 2) conservative amino acid replacement rates (Kc) and 3) ratios of radical over conservative amino acid replacement rates (Kr/Kc). Our results confirm the already known correlation between longevity and generation time and show, for the first time in an invertebrate class, a significant negative correlation between dS and longevity. This correlation was not as strong when generation time and mean temperature tolerance variations were also considered in our model (marginal correlation), suggesting a confounding effect of these traits on the relationship between longevity and mtDNA substitution rate. By confirming the negative correlation between dS and longevity previously documented in birds and mammals, our results provide support for a general pattern in substitution rates. Competing interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Copyright © 2021 Mortz, Levivier, Lartillot, Dufresne and Blier.
- Published
- 2021
- Full Text
- View/download PDF
18. Publisher Correction: Universal probabilistic programming offers a powerful approach to statistical phylogenetics.
- Author
-
Ronquist F, Kudlicka J, Senderov V, Borgström J, Lartillot N, Lundén D, Murray L, Schön TB, and Broman D
- Published
- 2021
- Full Text
- View/download PDF
19. A Bayesian Mutation-Selection Framework for Detecting Site-Specific Adaptive Evolution in Protein-Coding Genes.
- Author
-
Rodrigue N, Latrille T, and Lartillot N
- Subjects
- Bayes Theorem, Biological Evolution, Genetic Techniques, Models, Genetic, Mutation, Selection, Genetic
- Abstract
In recent years, codon substitution models based on the mutation-selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes across the entire gene, or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation-selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach to a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method, and gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program. (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2021
- Full Text
- View/download PDF
20. Universal probabilistic programming offers a powerful approach to statistical phylogenetics.
- Author
-
Ronquist F, Kudlicka J, Senderov V, Borgström J, Lartillot N, Lundén D, Murray L, Schön TB, and Broman D
- Subjects
- Animals, Bayes Theorem, Birds genetics, Data Interpretation, Statistical, Models, Statistical, Monte Carlo Method, Probability, Programming Languages, Artificial Intelligence, Biological Evolution, Biostatistics, Birds physiology, Phylogeny, Software
- Abstract
Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here, we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.
- Published
- 2021
- Full Text
- View/download PDF
21. Scalable Empirical Mixture Models That Account for Across-Site Compositional Heterogeneity.
- Author
-
Schrempf D, Lartillot N, and Szöllősi G
- Subjects
- Cluster Analysis, Amino Acid Substitution, Genetic Techniques, Models, Genetic, Phylogeny, Software
- Abstract
Biochemical demands constrain the range of amino acids acceptable at specific sites resulting in across-site compositional heterogeneity of the amino acid replacement process. Phylogenetic models that disregard this heterogeneity are prone to systematic errors, which can lead to severe long-branch attraction artifacts. State-of-the-art models accounting for across-site compositional heterogeneity include the CAT model, which is computationally expensive, and empirical distribution mixture models estimated via maximum likelihood (C10-C60 models). Here, we present a new, scalable method EDCluster for finding empirical distribution mixture models involving a simple cluster analysis. The cluster analysis utilizes specific coordinate transformations which allow the detection of specialized amino acid distributions either from curated databases or from the alignment at hand. We apply EDCluster to the HOGENOM and HSSP databases in order to provide universal distribution mixture (UDM) models comprising up to 4,096 components. Detailed analyses of the UDM models demonstrate the removal of various long-branch attraction artifacts and improved performance compared with the C10-C60 models. Ready-to-use implementations of the UDM models are provided for three established software packages (IQ-TREE, Phylobayes, and RevBayes). (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2020
- Full Text
- View/download PDF
22. From Inquilines to Gall Inducers: Genomic Signature of a Life-Style Transition in Synergus Gall Wasps.
- Author
-
Gobbo E, Lartillot N, Hearn J, Stone GN, Abe Y, Wheat CW, Ide T, and Ronquist F
- Subjects
- Animals, Gene Duplication, Models, Genetic, Quercus parasitology, Biological Evolution, Genome, Insect, Plant Tumors parasitology, Selection, Genetic, Wasps physiology
- Abstract
Gall wasps (Hymenoptera: Cynipidae) induce complex galls on oaks, roses, and other plants, but the mechanism of gall induction is still unknown. Here, we take a comparative genomic approach to revealing the genetic basis of gall induction. We focus on Synergus itoensis, a species that induces galls inside oak acorns. Previous studies suggested that this species evolved the ability to initiate gall formation recently, as it is deeply nested within the genus Synergus, whose members are mostly inquilines that develop inside the galls of other species. We compared the genome of S. itoensis with that of three related Synergus inquilines to identify genomic changes associated with the origin of gall induction. We used a novel Bayesian selection analysis, which accounts for branch-specific and gene-specific selection effects, to search for signatures of selection in 7,600 single-copy orthologous genes shared by the four Synergus species. We found that the terminal branch leading to S. itoensis had more genes with a significantly elevated dN/dS ratio (positive signature genes) than the other terminal branches in the tree; the S. itoensis branch also had more genes with a significantly decreased dN/dS ratio. Gene set enrichment analysis showed that the positive signature gene set of S. itoensis, unlike those of the inquiline species, is enriched in several biological process Gene Ontology terms, the most prominent of which is "Ovarian Follicle Cell Development." Our results indicate that the origin of gall induction is associated with distinct genomic changes, and provide a good starting point for further characterization of the genes involved. (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2020
- Full Text
- View/download PDF
23. Detecting adaptive convergent amino acid evolution.
- Author
-
Rey C, Lanore V, Veber P, Guéguen L, Lartillot N, Sémon M, and Boussau B
- Subjects
- Amino Acids metabolism, Animals, Genomics, Humans, Models, Genetic, Phylogeny, Proteins metabolism, Amino Acids genetics, Evolution, Molecular, Proteins genetics
- Abstract
In evolutionary genomics, researchers have taken an interest in identifying substitutions that subtend convergent phenotypic adaptations. This is a difficult question that requires distinguishing foreground convergent substitutions that are involved in the convergent phenotype from background convergent substitutions. Those may be linked to other adaptations, may be neutral or may be the consequence of mutational biases. Furthermore, there is no generally accepted definition of convergent substitutions. Various methods that use different definitions have been proposed in the literature, resulting in different sets of candidate foreground convergent substitutions. In this article, we first describe the processes that can generate foreground convergent substitutions in coding sequences, separating adaptive from non-adaptive processes. Second, we review methods that have been proposed to detect foreground convergent substitutions in coding sequences and expose the assumptions that underlie them. Finally, we examine their power on simulations of convergent changes, including in the presence of a change in the efficacy of selection, and on empirical alignments. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
- Published
- 2019
- Full Text
- View/download PDF
24. Erratum: Life History Traits Impact the Nuclear Rate of Substitution but Not the Mitochondrial Rate in Isopods.
- Author
-
Saclier N, François CM, Konecny-Dupré L, Lartillot N, Guéguen L, Duret L, Malard F, Douady CJ, and Lefébure T
- Published
- 2019
- Full Text
- View/download PDF
25. Life History Traits Impact the Nuclear Rate of Substitution but Not the Mitochondrial Rate in Isopods.
- Author
-
Saclier N, François CM, Konecny-Dupré L, Lartillot N, Guéguen L, Duret L, Malard F, Douady CJ, and Lefébure T
- Subjects
- Animals, DNA Replication, Ecosystem, Electron Transport, Isopoda metabolism, Isopoda radiation effects, Protein Biosynthesis, Selection, Genetic, Evolution, Molecular, Genome, Mitochondrial, Isopoda genetics, Life History Traits
- Abstract
The rate of molecular evolution varies widely among species. Life history traits (LHTs) have been proposed as a major driver of these variations. However, the relative contribution of each trait is poorly understood. Here, we test the influence of metabolic rate (MR), longevity, and generation time (GT) on the nuclear and mitochondrial synonymous substitution rates using a group of isopod species that have made multiple independent transitions to subterranean environments. Subterranean species have repeatedly evolved a lower MR, a longer lifespan and a longer GT. We assembled the nuclear transcriptomes and the mitochondrial genomes of 13 pairs of closely related isopods, each pair composed of one surface and one subterranean species. We found that subterranean species have a lower rate of nuclear synonymous substitution than surface species whereas the mitochondrial rate remained unchanged. We propose that this decoupling between nuclear and mitochondrial rates comes from different DNA replication processes in these two compartments. In isopods, the nuclear rate is probably tightly controlled by GT alone. In contrast, mitochondrial genomes appear to replicate and mutate at a rate independent of LHTs. These results are incongruent with previous studies, which were mostly devoted to vertebrates. We suggest that this incongruence can be explained by developmental differences between animal clades, with a quiescent period during female gametogenesis in mammals and birds which imposes a nuclear and mitochondrial rate coupling, as opposed to the continuous gametogenesis observed in most arthropods.
- Published
- 2018
- Full Text
- View/download PDF
26. Conditional Approximate Bayesian Computation: A New Approach for Across-Site Dependency in High-Dimensional Mutation-Selection Models.
- Author
-
Laurin-Lemay S, Rodrigue N, Lartillot N, and Philippe H
- Subjects
- Animals, Bayes Theorem, Humans, Mammals genetics, Monte Carlo Method, Evolution, Molecular, Genetic Techniques, Models, Genetic, Mutation, Selection, Genetic
- Abstract
A key question in molecular evolutionary biology concerns the relative roles of mutation and selection in shaping genomic data. Moreover, features of mutation and selection are heterogeneous along the genome and over time. Mechanistic codon substitution models based on the mutation-selection framework are promising approaches to separating these effects. In practice, however, several complications arise, since accounting for such heterogeneities often implies handling models of high dimensionality (e.g., amino acid preferences), or leads to across-site dependence (e.g., CpG hypermutability), making the likelihood function intractable. Approximate Bayesian Computation (ABC) could address this latter issue. Here, we propose a new approach, named Conditional ABC (CABC), which combines the sampling efficiency of MCMC and the flexibility of ABC. To illustrate the potential of the CABC approach, we apply it to the study of mammalian CpG hypermutability based on a new mutation-level parameter implying dependence across adjacent sites, combined with site-specific purifying selection on amino-acids captured by a Dirichlet process. Our proof-of-concept of the CABC methodology opens new modeling perspectives. Our application of the method reveals a high level of heterogeneity of CpG hypermutability across loci and mild heterogeneity across taxonomic groups; and finally, we show that CpG hypermutability is an important evolutionary factor in rendering relative synonymous codon usage. All source code is available as a GitHub repository (https://github.com/Simonll/LikelihoodFreePhylogenetics.git).
- Published
- 2018
- Full Text
- View/download PDF
27. Correction: Molecular adaptation in Rubisco: Discriminating between convergent evolution and positive selection using mechanistic and classical codon models.
- Author
-
Parto S and Lartillot N
- Abstract
[This corrects the article DOI: 10.1371/journal.pone.0192697.].
- Published
- 2018
- Full Text
- View/download PDF
28. Molecular adaptation in Rubisco: Discriminating between convergent evolution and positive selection using mechanistic and classical codon models.
- Author
-
Parto S and Lartillot N
- Subjects
- Carbon Dioxide metabolism, Photosynthesis, Phylogeny, Codon, Ribulose-Bisphosphate Carboxylase metabolism
- Abstract
Rubisco (Ribulose-1,5-bisphosphate carboxylase/oxygenase) is the most important enzyme on earth, catalyzing the first step of photosynthetic CO2 fixation. So, without it, there would be no storing of the sun's energy in plants. Molecular adaptation of Rubisco to the C4 photosynthetic pathway has attracted a lot of attention. C4 plants, which comprise less than 5% of land plants, have evolved more efficient photosynthesis compared to C3 plants. Interestingly, a large number of independent transitions from the C3 to the C4 phenotype have occurred. Each time, the Rubisco enzyme has been subject to similar changes in selective pressure, thus providing an excellent model for convergent evolution at the molecular level. Molecular adaptation is often identified with positive selection and is typically characterized by an elevated ratio of non-synonymous to synonymous substitution rate (dN/dS). However, convergent adaptation is expected to leave a different molecular signature, taking the form of repeated transitions toward identical or similar amino acids. Here, we used a previously introduced codon-based differential-selection model to detect and quantify consistent patterns of convergent adaptation in Rubisco in eudicots. We further contrasted our results with those obtained by classical codon models based on the estimation of dN/dS. We found that the two classes of models tend to select distinct, although overlapping, sets of positions. This discrepancy in the results illustrates the conceptual difference between these models while emphasizing the need to better discriminate between qualitatively different selective regimes, by using a broader class of codon models than those currently considered in molecular evolutionary studies.
- Published
- 2018
- Full Text
- View/download PDF
29. The Red Queen model of recombination hot-spot evolution: a theoretical investigation.
- Author
-
Latrille T, Duret L, and Lartillot N
- Subjects
- Animals, Mice, Models, Genetic, Gene Conversion, Genetic Variation, Recombination, Genetic
- Abstract
In humans and many other species, recombination events cluster in narrow and short-lived hot spots distributed across the genome, whose location is determined by the Zn-finger protein PRDM9. To explain these fast evolutionary dynamics, an intra-genomic Red Queen model has been proposed, based on the interplay between two antagonistic forces: biased gene conversion, mediated by double-strand breaks, resulting in hot-spot extinction, followed by positive selection favouring new PRDM9 alleles recognizing new sequence motifs. Thus far, however, this Red Queen model has not been formalized as a quantitative population-genetic model, fully accounting for the intricate interplay between biased gene conversion, mutation, selection, demography and genetic diversity at the PRDM9 locus. Here, we explore the population genetics of the Red Queen model of recombination. A Wright-Fisher simulator was implemented, allowing exploration of the behaviour of the model (mean equilibrium recombination rate, diversity at the PRDM9 locus or turnover rate) as a function of the parameters (effective population size, mutation and erosion rates). In a second step, analytical results based on self-consistent mean-field approximations were derived, reproducing the scaling relations observed in the simulations. Empirical fit of the model to current data from the mouse suggests both a high mutation rate at PRDM9 and strong biased gene conversion on its targets. This article is part of the themed issue 'Evolutionary causes and consequences of recombination rate variation in sexual organisms'. (© 2017 The Authors.)
- Published
- 2017
- Full Text
- View/download PDF
30. Improved Modeling of Compositional Heterogeneity Supports Sponges as Sister to All Other Animals.
- Author
-
Feuda R, Dohrmann M, Pett W, Philippe H, Rota-Stabelli O, Lartillot N, Wörheide G, and Pisani D
- Subjects
- Animals, Sequence Analysis, Protein, Biological Evolution, Phylogeny, Porifera classification
- Abstract
The relationships at the root of the animal tree have proven difficult to resolve, with the current debate focusing on whether sponges (phylum Porifera) or comb jellies (phylum Ctenophora) are the sister group of all other animals [1-5]. The choice of evolutionary models seems to be at the core of the problem because Porifera tends to emerge as the sister group of all other animals ("Porifera-sister") when site-specific amino acid differences are modeled (e.g., [6, 7]), whereas Ctenophora emerges as the sister group of all other animals ("Ctenophora-sister") when they are ignored (e.g., [8-11]). We show that two key phylogenomic datasets that previously supported Ctenophora-sister [10, 12] display strong heterogeneity in amino acid composition across sites and taxa and that no routinely used evolutionary model can adequately describe both forms of heterogeneity. We show that data-recoding methods [13-15] reduce compositional heterogeneity in these datasets and that models accommodating site-specific amino acid preferences can better describe the recoded datasets. Increased model adequacy is associated with significant topological changes in support of Porifera-sister. Because adequate modeling of the evolutionary process that generated the data is fundamental to recovering an accurate phylogeny [16-20], our results strongly support sponges as the sister group of all other animals and provide further evidence that Ctenophora-sister represents a tree reconstruction artifact. (Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.)
- Published
- 2017
- Full Text
- View/download PDF
31. Detecting consistent patterns of directional adaptation using differential selection codon models.
- Author
-
Parto S and Lartillot N
- Subjects
- Amino Acid Sequence, Evolution, Molecular, HIV physiology, Humans, Mutation, Open Reading Frames, Phylogeny, Selection, Genetic, gag Gene Products, Human Immunodeficiency Virus chemistry, gag Gene Products, Human Immunodeficiency Virus genetics, Bayes Theorem, Codon, HIV genetics, Models, Genetic
- Abstract
Background: Phylogenetic codon models are often used to characterize the selective regimes acting on protein-coding sequences. Recent methodological developments have led to models explicitly accounting for the interplay between mutation and selection, by modeling the amino acid fitness landscape along the sequence. However, thus far, most of these models have assumed that the fitness landscape is constant over time. Fluctuations of the fitness landscape may often be random or depend on complex and unknown factors. However, some organisms may be subject to systematic changes in selective pressure, resulting in reproducible molecular adaptations across independent lineages subject to similar conditions. Results: Here, we introduce a codon-based differential selection model, which aims to detect and quantify the fine-grained consistent patterns of adaptation at the protein-coding level, as a function of external conditions experienced by the organism under investigation. The model parameterizes the global mutational pressure, as well as the site- and condition-specific amino acid selective preferences. This phylogenetic model is implemented in a Bayesian MCMC framework. After validation with simulations, we applied our method to a dataset of HIV sequences from patients with known HLA genetic background. Our differential selection model detects and characterizes differentially selected coding positions specifically associated with two different HLA alleles. Conclusion: Our differential selection model is able to identify consistent molecular adaptations as a function of repeated changes in the environment of the organism. These models can be applied to many other problems, ranging from viral adaptation to evolution of life-history strategies in plants or animals.
- Published
- 2017
- Full Text
- View/download PDF
32. Detecting Adaptation in Protein-Coding Genes Using a Bayesian Site-Heterogeneous Mutation-Selection Codon Substitution Model.
- Author
-
Rodrigue N and Lartillot N
- Subjects
- Amino Acids genetics, Bayes Theorem, Computer Simulation, Epistasis, Genetic, Evolution, Molecular, Genetic Heterogeneity, Mutation, Mutation Rate, Phylogeny, Adaptation, Biological genetics, Amino Acid Substitution, Codon, Models, Genetic, Selection, Genetic genetics
- Abstract
Codon substitution models have traditionally attempted to uncover signatures of adaptation within protein-coding genes by contrasting the rates of synonymous and non-synonymous substitutions. Another modeling approach, known as the mutation-selection framework, attempts to explicitly account for selective patterns at the amino acid level, with some approaches allowing for heterogeneity in these patterns across codon sites. Under such a model, substitutions at a given position occur at the neutral or nearly neutral rate when they are synonymous, or when they correspond to replacements between amino acids of similar fitness; substitutions from high to low (low to high) fitness amino acids have comparatively low (high) rates. Here, we study the use of such a mutation-selection framework as a null model for the detection of adaptation. Following previous works in this direction, we include a deviation parameter that has the effect of capturing the surplus, or deficit, in non-synonymous rates, relative to what would be expected under a mutation-selection modeling framework that includes a Dirichlet process approach to account for across-codon-site variation in amino acid fitness profiles. We use simulations, along with a few real data sets, to study the behavior of the approach, and find it to have good power with a low false-positive rate. Altogether, we emphasize the potential of recent mutation-selection models in the detection of adaptation, calling for further model refinements as well as large-scale applications. (© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2017
- Full Text
- View/download PDF
33. Closing the gap between rocks and clocks using total-evidence dating.
- Author
-
Ronquist F, Lartillot N, and Phillips MJ
- Subjects
- Animals, Calibration, Evolution, Molecular, Phylogeny, Time, Biological Evolution, Fossils anatomy & histology, Mammals anatomy & histology, Mammals genetics
- Abstract
Total-evidence dating (TED) allows evolutionary biologists to incorporate a wide range of dating information into a unified statistical analysis. One might expect this to improve the agreement between rocks and clocks but this is not necessarily the case. We explore the reasons for such discordance using a mammalian dataset with rich molecular, morphological and fossil information. There is strong conflict in this dataset between morphology and molecules under standard stochastic models. This causes TED to push divergence events back in time when using inadequate models or vague priors, a phenomenon we term 'deep root attraction' (DRA). We identify several causes of DRA. Failure to account for diversified sampling results in dramatic DRA, but this can be addressed using existing techniques. Inadequate morphological models also appear to be a major contributor to DRA. The major reason seems to be that current models do not account for dependencies among morphological characters, causing distorted topology and branch length estimates. This is particularly problematic for huge morphological datasets, which may contain large numbers of correlated characters. Finally, diversification and fossil sampling priors that do not incorporate all the available background information can contribute to DRA, but these priors can also be used to compensate for DRA. Specifically, we show that DRA in the mammalian dataset can be addressed by introducing a modest extra penalty for ghost lineages that are unobserved in the fossil record, for instance by assuming rapid diversification, rare extinction or high fossil sampling rate; any of these assumptions produces highly congruent divergence time estimates with a minimal gap between rocks and clocks. Under these conditions, fossils have a stabilizing influence on divergence time estimates and significantly increase the precision of those estimates, which are generally close to the dates suggested by palaeontologists. This article is part of the themed issue 'Dating species divergences using rocks and clocks'. (© 2016 The Authors.)
- Published
- 2016
- Full Text
- View/download PDF
34. A mixed relaxed clock model.
- Author
-
Lartillot N, Phillips MJ, and Ronquist F
- Subjects
- Animals, Bayes Theorem, Time Factors, Evolution, Molecular, Fossils anatomy & histology, Mammals genetics, Phylogeny
- Abstract
Over recent years, several alternative relaxed clock models have been proposed in the context of Bayesian dating. These models fall in two distinct categories: uncorrelated and autocorrelated across branches. The choice between these two classes of relaxed clocks is still an open question. More fundamentally, the true process of rate variation may have both long-term trends and short-term fluctuations, suggesting that more sophisticated clock models unfolding over multiple time scales should ultimately be developed. Here, a mixed relaxed clock model is introduced, which can be mechanistically interpreted as a rate variation process undergoing short-term fluctuations on top of Brownian long-term trends. Statistically, this mixed clock represents an alternative solution to the problem of choosing between autocorrelated and uncorrelated relaxed clocks, by proposing instead to combine their respective merits. Fitting this model on a dataset of 105 placental mammals, using both node-dating and tip-dating approaches, suggests that the two pure clocks, Brownian and white noise, are rejected in favour of a mixed model with approximately equal contributions for its uncorrelated and autocorrelated components. The tip-dating analysis is particularly sensitive to the choice of the relaxed clock model. In this context, the classical pure Brownian relaxed clock appears to be overly rigid, leading to biases in divergence time estimation. By contrast, the use of a mixed clock leads to more recent and more reasonable estimates for the crown ages of placental orders and superorders. Altogether, the mixed clock introduced here represents a first step towards empirically more adequate models of the patterns of rate variation across phylogenetic trees. This article is part of the themed issue 'Dating species divergences using rocks and clocks'. (© 2016 The Authors.)
- Published
- 2016
- Full Text
- View/download PDF
35. RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.
- Author
-
Höhna S, Landis MJ, Heath TA, Boussau B, Lartillot N, Moore BR, Huelsenbeck JP, and Ronquist F
- Subjects
- Bayes Theorem, Classification methods, Models, Biological, Phylogeny, Software
- Abstract
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.] (© The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.)
- Published
- 2016
- Full Text
- View/download PDF
36. Reply to Halanych et al.: Ctenophore misplacement is corroborated by independent datasets.
- Author
-
Pisani D, Pett W, Dohrmann M, Feuda R, Rota-Stabelli O, Philippe H, Lartillot N, and Wörheide G
- Subjects
- Animals, Ctenophora classification, Ctenophora genetics, Databases, Genetic, Genome
- Published
- 2016
- Full Text
- View/download PDF
37. Genomic data do not support comb jellies as the sister group to all other animals.
- Author
-
Pisani D, Pett W, Dohrmann M, Feuda R, Rota-Stabelli O, Philippe H, Lartillot N, and Wörheide G
- Subjects
- Animals, Bayes Theorem, Bias, Likelihood Functions, Models, Genetic, Phylogeny, Reproducibility of Results, Selection, Genetic, Ctenophora classification, Ctenophora genetics, Databases, Genetic, Genome
- Abstract
Understanding how complex traits, such as epithelia, nervous systems, muscles, or guts, originated depends on a well-supported hypothesis about the phylogenetic relationships among major animal lineages. Traditionally, sponges (Porifera) have been interpreted as the sister group to the remaining animals, a hypothesis consistent with the conventional view that the last common animal ancestor was relatively simple and more complex body plans arose later in evolution. However, this premise has recently been challenged by analyses of the genomes of comb jellies (Ctenophora), which, instead, found ctenophores as the sister group to the remaining animals (the "Ctenophora-sister" hypothesis). Because ctenophores are morphologically complex predators with true epithelia, nervous systems, muscles, and guts, this scenario implies these traits were either present in the last common ancestor of all animals and were lost secondarily in sponges and placozoans (Trichoplax) or, alternatively, evolved convergently in comb jellies. Here, we analyze representative datasets from recent studies supporting Ctenophora-sister, including genome-scale alignments of concatenated protein sequences, as well as a genomic gene content dataset. We found no support for Ctenophora-sister and conclude it is an artifact resulting from inadequate methodology, especially the use of simplistic evolutionary models and inappropriate choice of species to root the metazoan tree. Our results reinforce a traditional scenario for the evolution of complexity in animals, and indicate that inferences about the evolution of Metazoa based on the Ctenophora-sister hypothesis are not supported by the currently available data.
- Published
- 2015
- Full Text
- View/download PDF
38. Probabilistic models of eukaryotic evolution: time for integration.
- Author
-
Lartillot N
- Subjects
- Bayes Theorem, Genomics methods, Monte Carlo Method, Biological Evolution, Eukaryotic Cells cytology, Models, Genetic
- Abstract
In spite of substantial work and recent progress, a global and fully resolved picture of the macroevolutionary history of eukaryotes is still under construction. This concerns not only the phylogenetic relations among major groups, but also the general characteristics of the underlying macroevolutionary processes, including the patterns of gene family evolution associated with endosymbioses, as well as their impact on the sequence evolutionary process. All these questions raise formidable methodological challenges, calling for a more powerful statistical paradigm. In this direction, model-based probabilistic approaches have played an increasingly important role. In particular, improved models of sequence evolution accounting for heterogeneities across sites and across lineages have led to significant, although insufficient, improvement in phylogenetic accuracy. More recently, one main trend has been to move away from simple parametric models and stepwise approaches, towards integrative models explicitly considering the intricate interplay between multiple levels of macroevolutionary processes. Such integrative models are in their infancy, and their application to the phylogeny of eukaryotes still requires substantial improvement of the underlying models, as well as additional computational developments. (© 2015 The Author(s).)
- Published
- 2015
- Full Text
- View/download PDF
39. The red queen model of recombination hotspots evolution in the light of archaic and modern human genomes.
- Author
-
Lesecque Y, Glémin S, Lartillot N, Mouchiroud D, and Duret L
- Subjects
- Animals, Chromosomes genetics, DNA-Binding Proteins, Gene Conversion, Genome, Human, Humans, Meiosis genetics, Pan troglodytes, Crossing Over, Genetic, Evolution, Molecular, Histone-Lysine N-Methyltransferase genetics, Recombination, Genetic
- Abstract
Recombination is an essential process in eukaryotes, which increases diversity by disrupting genetic linkage between loci and ensures the proper segregation of chromosomes during meiosis. In the human genome, recombination events are clustered in hotspots, whose location is determined by the PRDM9 protein. There is evidence that the location of hotspots evolves rapidly, as a consequence of changes in PRDM9 DNA-binding domain. However, the reasons for these changes and the rate at which they occur are not known. In this study, we investigated the evolution of human hotspot loci and of PRDM9 target motifs, both in modern and archaic human lineages (Denisovan) to quantify the dynamic of hotspot turnover during the recent period of human evolution. We show that present-day human hotspots are young: they have been active only during the last 10% of the time since the divergence from chimpanzee, starting to be operating shortly before the split between Denisovans and modern humans. Surprisingly, however, our analyses indicate that Denisovan recombination hotspots did not overlap with modern human ones, despite sharing similar PRDM9 target motifs. We further show that high-affinity PRDM9 target motifs are subject to a strong self-destructive drive, known as biased gene conversion (BGC), which should lead to the loss of the majority of them in the next 3 MYR. This depletion of PRDM9 genomic targets is expected to decrease fitness, and thereby to favor new PRDM9 alleles binding different motifs. Our refined estimates of the age and life expectancy of human hotspots provide empirical evidence in support of the Red Queen hypothesis of recombination hotspots evolution.
- Published
- 2014
- Full Text
- View/download PDF
40. Monte Carlo algorithms for Brownian phylogenetic models.
- Author
-
Horvilleur B and Lartillot N
- Subjects
- Markov Chains, Models, Genetic, Monte Carlo Method, Algorithms, Phylogeny
- Abstract
Motivation: Brownian models have been introduced in phylogenetics for describing variation in substitution rates through time, with applications to molecular dating or to the comparative analysis of variation in substitution patterns among lineages. Thus far, however, the Monte Carlo implementations of these models have relied on crude approximations, in which the Brownian process is sampled only at the internal nodes of the phylogeny or at the midpoints along each branch, and the unknown trajectory between these sampled points is summarized by simple branchwise average substitution rates. Results: A more accurate Monte Carlo approach is introduced, explicitly sampling a fine-grained discretization of the trajectory of the (potentially multivariate) Brownian process along the phylogeny. Generic Monte Carlo resampling algorithms are proposed for updating the Brownian paths along and across branches. Specific computational strategies are developed for efficient integration of the finite-time substitution probabilities across branches induced by the Brownian trajectory. The mixing properties and the computational complexity of the resulting Markov chain Monte Carlo sampler scale reasonably with the discretization level, allowing practical applications with up to a few hundred discretization points along the entire depth of the tree. The method can be generalized to other Markovian stochastic processes, making it possible to implement a wide range of time-dependent substitution models with well-controlled computational precision. Availability: The program is freely available at www.phylobayes.org. (© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2014
- Full Text
- View/download PDF
41. Site-heterogeneous mutation-selection models within the PhyloBayes-MPI package.
- Author
-
Rodrigue N and Lartillot N
- Subjects
- Bayes Theorem, Codon, Models, Genetic, Software, Mutation, Phylogeny
- Abstract
Motivation: In recent years, there has been an increasing interest in the potential of codon substitution models for a variety of applications. However, the computational demands of these models have sometimes led to the adoption of oversimplified assumptions, questionable statistical methods or a limited focus on small data sets. Results: Here, we offer a scalable, message-passing-interface-based Bayesian implementation of site-heterogeneous codon models in the mutation-selection framework. Our software jointly infers the global mutational parameters at the nucleotide level, the branch lengths of the tree and a Dirichlet process governing across-site variation at the amino acid level. We focus on an example estimation of the distribution of selection coefficients from an alignment of several hundred sequences of the influenza PB2 gene, and highlight the site-specific characterization enabled by such a modeling approach. Finally, we discuss future potential applications of the software for conducting evolutionary inferences. Availability and Implementation: The models are implemented within the PhyloBayes-MPI package (available at phylobayes.org), along with usage details in the accompanying manual.
- Published
- 2014
- Full Text
- View/download PDF
42. A phylogenetic Kalman filter for ancestral trait reconstruction using molecular data.
- Author
-
Lartillot N
- Subjects
- Archaea growth & development, Base Composition, Data Interpretation, Statistical, Linear Models, Markov Chains, Models, Biological, Monte Carlo Method, Phenotype, RNA, Ribosomal genetics, Temperature, Algorithms, Archaea genetics, Bayes Theorem, Biological Evolution, Phylogeny
- Abstract
Motivation: Correlation between life history or ecological traits and genomic features such as nucleotide or amino acid composition can be used for reconstructing the evolutionary history of the traits of interest along phylogenies. Thus far, however, such ancestral reconstructions have been done using simple linear regression approaches that do not account for phylogenetic inertia. These reconstructions could instead be seen as a genuine comparative regression problem, such as formalized by classical generalized least-square comparative methods, in which the trait of interest and the molecular predictor are represented as correlated Brownian characters coevolving along the phylogeny. Results: Here, a Bayesian sampler is introduced, representing an alternative and more efficient algorithmic solution to this comparative regression problem, compared with currently existing generalized least-square approaches. Technically, ancestral trait reconstruction based on a molecular predictor is shown to be formally equivalent to a phylogenetic Kalman filter problem, for which backward and forward recursions are developed and implemented in the context of a Markov chain Monte Carlo sampler. The comparative regression method results in more accurate reconstructions and a more faithful representation of uncertainty, compared with simple linear regression. Application to the reconstruction of the evolution of optimal growth temperature in Archaea, using GC composition in ribosomal RNA stems and amino acid composition of a sample of protein-coding genes, confirms previous findings, in particular, pointing to a hyperthermophilic ancestor for the kingdom. Availability and Implementation: The program is freely available at www.phylobayes.org.
- Published
- 2014
- Full Text
- View/download PDF
43. An experimentally tested scenario for the structural evolution of eukaryotic Cys2His2 zinc fingers from eubacterial ros homologs.
- Author
-
Netti F, Malgieri G, Esposito S, Palmieri M, Baglivo I, Isernia C, Omichinski JG, Pedone PV, Lartillot N, and Fattorusso R
- Subjects
- Agrobacterium tumefaciens genetics, Amino Acid Sequence, Bacteria chemistry, Bacteria genetics, Bacterial Proteins genetics, Binding Sites, Gene Transfer, Horizontal, Protein Structure, Secondary, Protein Structure, Tertiary, Sequence Alignment, Agrobacterium tumefaciens chemistry, Bacterial Proteins chemistry, Evolution, Molecular, Zinc Fingers
- Abstract
The exact evolutionary origin of the zinc finger (ZF) domain is unknown, as it is still not clear from which organisms it was first derived. However, the unique features of the ZF domains have made it very easy for evolution to tinker with them in a number of different manners, including their combination, variation of their number by unequal crossing-over or tandem duplication and tuning of their affinity for specific DNA sequence motifs through point substitutions. Classical Cys2His2 ZF domains as structurally autonomous motifs arranged in multiple copies are known only in eukaryotes. Nonetheless, a single prokaryotic Cys2His2 ZF domain has been identified in the transcriptional regulator Ros from Agrobacterium tumefaciens and recently characterized. The present work focuses on the evolution of the classical ZF domains with the goal of trying to determine whether eukaryotic ZFs have evolved from the prokaryotic Ros-like proteins. Our results, based on computational and experimental data, indicate that a single insertion of three amino acids in the short loop that separates the β-sheet from the α-helix of the Ros protein is sufficient to induce a structural transition from a Ros like to an eukaryotic-ZF like structure. This observation provides evidence for a structurally plausible and parsimonious scenario of fold evolution, giving a structural basis to the hypothesis of a horizontal gene transfer (HGT) from bacteria to eukaryotes.
- Published
- 2013
- Full Text
- View/download PDF
44. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment.
- Author
-
Lartillot N, Rodrigue N, Stubbs D, and Richer J
- Subjects
- Algorithms, Bayes Theorem, Models, Genetic, Computational Biology methods, Phylogeny, Software
- Abstract
Modeling across site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, more realistic and empirically more fit, have been practically out of reach. We introduce a message passing interface version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient thanks to the combination of two algorithmic strategies: a partial Gibbs sampling update of the tree topology and the use of a truncated stick-breaking representation for the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website www.phylobayes.org.
- Published
- 2013
- Full Text
- View/download PDF
45. Lateral gene transfer from the dead.
- Author
-
Szöllosi GJ, Tannier E, Lartillot N, and Daubin V
- Subjects
- Algorithms, Biodiversity, Biological Evolution, Genetic Speciation, Models, Genetic, Sequence Analysis, DNA, Cyanobacteria genetics, DNA, Bacterial analysis, Evolution, Molecular, Gene Transfer, Horizontal, Genes, Bacterial, Phylogeny
- Abstract
In phylogenetic studies, the evolution of molecular sequences is assumed to have taken place along the phylogeny traced by the ancestors of extant species. In the presence of lateral gene transfer, however, this may not be the case, because the species lineage from which a gene was transferred may have gone extinct or not have been sampled. Because it is not feasible to specify or reconstruct the complete phylogeny of all species, we must describe the evolution of genes outside the represented phylogeny by modeling the speciation dynamics that gave rise to the complete phylogeny. We demonstrate that if the number of sampled species is small compared with the total number of existing species, the overwhelming majority of gene transfers involve speciation to and evolution along extinct or unsampled lineages. We show that the evolution of genes along extinct or unsampled lineages can to good approximation be treated as those of independently evolving lineages described by a few global parameters. Using this result, we derive an algorithm to calculate the probability of a gene tree and recover the maximum-likelihood reconciliation given the phylogeny of the sampled species. Examining 473 near-universal gene families from 36 cyanobacteria, we find that nearly a third of transfer events (28%) appear to have topological signatures of evolution along extinct species, but only approximately 6% of transfers trace their ancestry to before the common ancestor of the sampled cyanobacteria.
- Published
- 2013
- Full Text
- View/download PDF
46. Phylogenetic patterns of GC-biased gene conversion in placental mammals and the evolutionary dynamics of recombination landscapes.
- Author
-
Lartillot N
- Subjects
- Algorithms, Animals, Base Composition, Bayes Theorem, Computer Simulation, CpG Islands, Gene Conversion, Genetic Loci, Genetic Variation, Genome, Humans, Mammals genetics, Markov Chains, Monte Carlo Method, Phylogeny, Primates genetics, Evolution, Molecular, Models, Genetic, Recombination, Genetic
- Abstract
GC-biased gene conversion (gBGC) is a major evolutionary force shaping genomic nucleotide landscapes, distorting the estimation of the strength of selection, and having potentially deleterious effects on genome-wide fitness. Yet, a global quantitative picture, at large evolutionary scale, of the relative strength of gBGC compared with selection and random drift is still lacking. Furthermore, owing to its dependence on the local recombination rate, gBGC results in modulations of the substitution patterns along genomes and across time which, if correctly interpreted, may yield quantitative insights into the long-term evolutionary dynamics of recombination landscapes. Deriving a model of the substitution process at putatively neutral nucleotide positions from population-genetics arguments, and accounting for among-lineage and among-gene effects, we propose a reconstruction of the variation in gBGC intensity at the scale of placental mammals, and of its scaling with body-size and karyotypic traits. Our results are compatible with a simple population genetics model relating gBGC to effective population size and recombination rate. In addition, among-gene variation and phylogenetic patterns of exon-specific levels of gBGC reveal the presence of rugged recombination landscapes, and suggest that short-lived recombination hot-spots are a general feature of placentals. Across placental mammals, variation in gBGC strength spans two orders of magnitude, at its lowest in apes, strongest in lagomorphs, microbats or tenrecs, and near or above the nearly neutral threshold in most other lineages. Combined with among-gene variation, such high levels of biased gene conversion are likely to significantly impact mildly selected positions, and to represent a substantial mutation load. Altogether, our analysis suggests a more important role of gBGC in placental genome evolution, compared with what could have been anticipated from studies conducted in anthropoid primates.
- Published
- 2013
- Full Text
- View/download PDF
47. Interaction between selection and biased gene conversion in mammalian protein-coding sequence evolution revealed by a phylogenetic covariance analysis.
- Author
-
Lartillot N
- Subjects
- Algorithms, Animals, Base Composition, Female, Genetics, Population, Humans, Male, Mammals genetics, Population Density, Evolution, Molecular, Gene Conversion, Models, Genetic, Open Reading Frames genetics, Phylogeny, Selection, Genetic
- Abstract
According to the nearly-neutral model, variation in long-term effective population size among species should result in correlated variation in the ratio of nonsynonymous over synonymous substitution rates (dN/dS). Previous empirical investigations in mammals have been consistent with this prediction, suggesting an important role for nearly-neutral effects on protein-coding sequence evolution. GC-biased gene conversion (gBGC), on the other hand, is increasingly recognized as a major evolutionary force shaping genome nucleotide composition. When sufficiently strong compared with random drift, gBGC may significantly interfere with a nearly-neutral regime and impact dN/dS in a complex manner. Here, we investigate the phylogenetic correlations between dN/dS, the equilibrium GC composition (GC*), and several life-history and karyotypic traits in placental mammals. We show that the equilibrium GC composition decreases with body mass and increases with the number of chromosomes, suggesting a modulation of the strength of biased gene conversion due to changes in effective population size and genome-wide recombination rate. The variation in dN/dS is complex and only partially fits the prediction of the nearly-neutral theory. However, specifically restricting estimation of the dN/dS ratio on GC-conservative transversions, which are immune from gBGC, results in correlations that are more compatible with a nearly-neutral interpretation. Our investigation indicates the presence of complex interactions between selection and biased gene conversion and suggests that further mechanistic development is warranted, to tease out mutation, selection, drift, and conversion.
- Published
- 2013
- Full Text
- View/download PDF
48. Serine codon-usage bias in deep phylogenomics: pancrustacean relationships as a case study.
- Author
-
Rota-Stabelli O, Lartillot N, Philippe H, and Pisani D
- Subjects
- Amino Acids genetics, Animals, Arthropods classification, Arthropods genetics, Bias, Diphosphotransferases genetics, Models, Genetic, Codon genetics, Crustacea classification, Crustacea genetics, Genomics, Phylogeny, Serine genetics
- Abstract
Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the 2-fold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different to the corresponding, but poorly supported, amino acid one. We show that the discrepancy between the nucleotide-based and the amino acids-based trees is caused by substitutions within synonymous codon families (especially those of serine-TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not the amino acids. We suggest that a parallel, partially compositionally driven, synonymous codon-usage bias affects the nucleotide topology. As substitutions between serine codon families can proceed through threonine or cysteine intermediates, amino acid data sets might also be affected by the serine codon-usage bias. We suggest that a Dayhoff recoding strategy would partially ameliorate the effects of such bias. Although amino acids provide an alternative hypothesis of pancrustacean relationships, neither the nucleotides nor the amino acids version of this data set seems to bring enough genuine phylogenetic information to robustly resolve the relationships within group, which should still be considered unresolved.
- Published
- 2013
- Full Text
- View/download PDF
49. Reconstructing the phylogenetic history of long-term effective population size and life-history traits using patterns of amino acid replacement in mitochondrial genomes of mammals and birds.
- Author
-
Nabholz B, Uwimana N, and Lartillot N
- Subjects
- Animals, Base Composition, Biological Evolution, Birds physiology, Body Size, Evolution, Molecular, Female, Fossils, Mammals physiology, Models, Statistical, Mutation, Placenta, Population Density, Pregnancy, Sequence Alignment, Amino Acid Substitution, Birds genetics, Genome, Mitochondrial, Longevity genetics, Mammals genetics, Phylogeny
- Abstract
The nearly neutral theory, which proposes that most mutations are deleterious or close to neutral, predicts that the ratio of nonsynonymous over synonymous substitution rates (dN/dS), and potentially also the ratio of radical over conservative amino acid replacement rates (Kr/Kc), are negatively correlated with effective population size. Previous empirical tests, using life-history traits (LHT) such as body-size or generation-time as proxies for population size, have been consistent with these predictions. This suggests that large-scale phylogenetic reconstructions of dN/dS or Kr/Kc might reveal interesting macroevolutionary patterns in the variation in effective population size among lineages. In this work, we further develop an integrative probabilistic framework for phylogenetic covariance analysis introduced previously, so as to estimate the correlation patterns between dN/dS, Kr/Kc, and three LHT, in mitochondrial genomes of birds and mammals. Kr/Kc displays stronger and more stable correlations with LHT than does dN/dS, which we interpret as a greater robustness of Kr/Kc, compared with dN/dS, the latter being confounded by the high saturation of the synonymous substitution rate in mitochondrial genomes. The correlation of Kr/Kc with LHT was robust when controlling for the potentially confounding effects of nucleotide compositional variation between taxa. The positive correlation of the mitochondrial Kr/Kc with LHT is compatible with previous reports, and with a nearly neutral interpretation, although alternative explanations are also possible. The Kr/Kc model was finally used for reconstructing life-history evolution in birds and mammals. This analysis suggests a fairly large-bodied ancestor in both groups. In birds, life-history evolution seems to have occurred mainly through size reduction in Neoavian birds, whereas in placental mammals, body mass evolution shows disparate trends across subclades. 
Altogether, our work represents a further step toward a more comprehensive phylogenetic reconstruction of the evolution of life-history and of the population-genetics environment.
- Published
- 2013
- Full Text
- View/download PDF
50. The interface of protein structure, protein biophysics, and molecular evolution.
- Author
-
Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg-Bauer E, Colwell LJ, de Koning AP, Dokholyan NV, Echave J, Elofsson A, Gerloff DL, Goldstein RA, Grahnen JA, Holder MT, Lakner C, Lartillot N, Lovell SC, Naylor G, Perica T, Pollock DD, Pupko T, Regan L, Roger A, Rubinstein N, Shakhnovich E, Sjölander K, Sunyaev S, Teufel AI, Thorne JL, Thornton JW, Weinreich DM, and Whelan S
- Subjects
- Amino Acid Sequence, Animals, Humans, Models, Molecular, Molecular Sequence Data, Protein Conformation, Protein Folding, RNA, Messenger genetics, Sequence Alignment, Evolution, Molecular, Proteins chemistry, Proteins genetics
- Abstract
The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction. (Copyright © 2012 The Protein Society.)
- Published
- 2012
- Full Text
- View/download PDF