103 results on '"Lartillot N"'
Search Results
2. Genes and sites under adaptation at the phylogenetic scale also exhibit adaptation at the population-genetic scale
- Author
-
Latrille, T., primary, Rodrigue, N., additional, and Lartillot, N., additional
- Published
- 2022
- Full Text
- View/download PDF
3. An improved codon modeling approach for accurate estimation of the mutation bias
- Author
-
Latrille, T., primary and Lartillot, N., additional
- Published
- 2021
- Full Text
- View/download PDF
4. A theoretical approach for quantifying the impact of changes in effective population size and expression level on the rate of coding sequence evolution
- Author
-
Latrille, T., primary and Lartillot, N., additional
- Published
- 2021
- Full Text
- View/download PDF
5. Inferring long-term effective population size with Mutation-Selection models
- Author
-
Latrille, T., primary, Lanore, V., additional, and Lartillot, N., additional
- Published
- 2021
- Full Text
- View/download PDF
6. Reconstruction of body mass evolution in the Cetartiodactyla and mammals using phylogenomic data
- Author
-
Figuet, E., primary, Ballenghien, M., additional, Lartillot, N., additional, and Galtier, N., additional
- Published
- 2017
- Full Text
- View/download PDF
7. Interaction between Selection and Biased Gene Conversion in Mammalian Protein-Coding Sequence Evolution Revealed by a Phylogenetic Covariance Analysis
- Author
-
Lartillot, N., primary
- Published
- 2012
- Full Text
- View/download PDF
8. A Phylogenetic Model for Investigating Correlated Evolution of Substitution Rates and Continuous Phenotypic Characters
- Author
-
Lartillot, N., primary and Poujol, R., additional
- Published
- 2010
- Full Text
- View/download PDF
9. Computational Methods for Evaluating Phylogenetic Models of Coding Sequence Evolution with Dependence between Codons
- Author
-
Rodrigue, N., primary, Kleinman, C. L., additional, Philippe, H., additional, and Lartillot, N., additional
- Published
- 2009
- Full Text
- View/download PDF
10. A General Comparison of Relaxed Molecular Clock Models
- Author
-
Lepage, T., primary, Bryant, D., additional, Philippe, H., additional, and Lartillot, N., additional
- Published
- 2007
- Full Text
- View/download PDF
11. Animal evolution: the end of the intermediate taxa?
- Author
-
Adoutte, A., Balavoine, G., Lartillot, N., and Rosa, R. de
- Published
- 1999
- Full Text
- View/download PDF
12. Fast optimization of statistical potentials for structurally constrained phylogenetic models
- Author
-
Rodrigue Nicolas, Kleinman Claudia L, Bonnard Cécile, and Lartillot Nicolas
- Subjects
- Evolution, QH359-425
- Abstract
Background: Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results: Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000 fold decrease in computational time for the optimization procedure). Conclusion: Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.
- Published
- 2009
- Full Text
- View/download PDF
13. Evaluation of the models handling heterotachy in phylogenetic inference
- Author
-
Philippe Hervé, Lartillot Nicolas, Rodrigue Nicolas, and Zhou Yan
- Subjects
- Evolution, QH359-425
- Abstract
Background: The evolutionary rate at a given homologous position varies across time. When sufficiently pronounced, this phenomenon – called heterotachy – may produce artefactual phylogenetic reconstructions under the commonly used models of sequence evolution. These observations have motivated the development of models that explicitly recognize heterotachy, with research directions proposed along two main axes: 1) the covarion approach, where sites switch from variable to invariable states; and 2) the mixture of branch lengths (MBL) approach, where alignment patterns are assumed to arise from one of several sets of branch lengths, under a given phylogeny. Results: Here, we report the first statistical comparisons contrasting the performance of covarion and MBL modeling strategies. Using simulations under heterotachous conditions, we explore the properties of three model comparison methods: the Akaike information criterion, the Bayesian information criterion, and cross validation. Although more time consuming, cross validation appears more reliable than AIC and BIC as it directly measures the predictive power of a model on 'future' data. We also analyze three large datasets (nuclear proteins of animals, mitochondrial proteins of mammals, and plastid proteins of plants), and find the optimal number of components of the MBL model to be two for all datasets, indicating that this model is preferred over the standard homogeneous model. However, the covarion model is always favored over the optimal MBL model. Conclusion: We demonstrated, using three large datasets, that the covarion model is more efficient at handling heterotachy than the MBL model. This is probably due to the fact that the MBL model requires a serious increase in the number of parameters, as compared to two supplementary parameters of the covarion approach. Further improvements of both the mixture and the covarion approaches might be obtained by modeling heterogeneous behavior both along time and across sites.
- Published
- 2007
- Full Text
- View/download PDF
14. A maximum likelihood framework for protein design
- Author
-
Philippe Hervé, Bonnard Cécile, Rodrigue Nicolas, Kleinman Claudia L, and Lartillot Nicolas
- Subjects
- Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Background: The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility. Results: We propose a formulation of the protein design problem in terms of model-based statistical inference. Our framework uses the maximum likelihood principle to optimize the unknown parameters of a statistical potential, which we call an inverse potential to contrast with classical potentials used for structure prediction. We propose an implementation based on Markov chain Monte Carlo, in which the likelihood is maximized by gradient descent and is numerically estimated by thermodynamic integration. The fit of the models is evaluated by cross-validation. We apply this to a simple pairwise contact potential, supplemented with a solvent-accessibility term, and show that the resulting models have a better predictive power than currently available pairwise potentials. Furthermore, the model comparison method presented here allows one to measure the relative contribution of each component of the potential, and to choose the optimal number of accessibility classes, which turns out to be much higher than classically considered. Conclusion: Altogether, this reformulation makes it possible to test a wide diversity of models, using different forms of potentials, or accounting for other factors than just the constraint of thermodynamic stability. Ultimately, such model-based statistical analyses may help to understand the forces shaping protein sequences, and driving their evolution.
- Published
- 2006
- Full Text
- View/download PDF
15. An Experimentally Tested Scenario for the Structural Evolution of Eukaryotic Cys2His2 Zinc Fingers from Eubacterial Ros Homologs
- Author
-
Sabrina Esposito, Fortuna Netti, James G. Omichinski, Paolo V. Pedone, Carla Isernia, Roberto Fattorusso, Nicolas Lartillot, Maddalena Palmieri, Gaetano Malgieri, and Ilaria Baglivo
Affiliations: Seconda Università degli Studi di Napoli (Second University of Naples); Université de Montréal (UdeM); Bioinformatique, phylogénie et génomique évolutive (BPGE), Département PEGASE, Laboratoire de Biométrie et Biologie Evolutive - UMR 5558 (LBBE), Université Claude Bernard Lyon 1 (UCBL), Université de Lyon, Inria, VetAgro Sup, CNRS
- Subjects
- Gene Transfer, Horizontal; Sequence alignment; Computational biology; Biology; 010402 general chemistry; 01 natural sciences; Protein Structure, Secondary; DNA sequencing; Evolution, Molecular; 03 medical and health sciences; Bacterial Proteins; [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN]; evolution; Genetics; Transcriptional regulation; Amino Acid Sequence; Binding site; Molecular Biology; Peptide sequence; Ecology, Evolution, Behavior and Systematics; 030304 developmental biology; Zinc finger; [STAT.AP]Statistics [stat]/Applications [stat.AP]; 0303 health sciences; Binding Sites; Bacteria; [SDV.BID.EVO]Life Sciences [q-bio]/Biodiversity/Populations and Evolution [q-bio.PE]; Zinc Fingers; Protein Structure, Tertiary; 0104 chemical sciences; nuclear magnetic resonance; Agrobacterium tumefaciens; phylogenetics analysis; Horizontal gene transfer; zinc finger domain; Tandem exon duplication; Sequence Alignment; [STAT.ME]Statistics [stat]/Methodology [stat.ME]
- Abstract
The exact evolutionary origin of the zinc finger (ZF) domain is unknown, as it is still not clear from which organisms it was first derived. However, the unique features of the ZF domains have made it very easy for evolution to tinker with them in a number of different manners, including their combination, variation of their number by unequal crossing-over or tandem duplication and tuning of their affinity for specific DNA sequence motifs through point substitutions. Classical Cys2His2 ZF domains as structurally autonomous motifs arranged in multiple copies are known only in eukaryotes. Nonetheless, a single prokaryotic Cys2His2 ZF domain has been identified in the transcriptional regulator Ros from Agrobacterium tumefaciens and recently characterized. The present work focuses on the evolution of the classical ZF domains with the goal of trying to determine whether eukaryotic ZFs have evolved from the prokaryotic Ros-like proteins. Our results, based on computational and experimental data, indicate that a single insertion of three amino acids in the short loop that separates the β-sheet from the α-helix of the Ros protein is sufficient to induce a structural transition from a Ros-like to a eukaryotic-ZF-like structure. This observation provides evidence for a structurally plausible and parsimonious scenario of fold evolution, giving a structural basis to the hypothesis of a horizontal gene transfer (HGT) from bacteria to eukaryotes.
- Published
- 2013
- Full Text
- View/download PDF
16. 'Structural evidences of the evolution of prokaryotic Cys2His2 zinc finger domains'
- Author
-
F. Netti, M. Palmieri, I. Baglivo, J. G. Omichinski, N. Lartillot, Gaetano Malgieri, Sabrina Esposito, Paolo Vincenzo Pedone, Carla Isernia, and Roberto Fattorusso
- Published
- 2012
17. 'Evolution of classical zinc finger domains'
- Author
-
F. Netti, I. Baglivo, N. Lartillot, J. G. Omichinski, Gaetano Malgieri, Sabrina Esposito, Paolo Vincenzo Pedone, Carla Isernia, and Roberto Fattorusso
- Published
- 2011
18. Genome Streamlining: Effect of Mutation Rate and Population Size on Genome Size Reduction.
- Author
-
Luiselli J, Rouzaud-Cornabas J, Lartillot N, and Beslon G
- Subjects
- Evolution, Molecular, Models, Genetic, Bacteria genetics, Genome Size, Mutation Rate, Genome, Bacterial, Population Density
- Abstract
Genome streamlining, i.e. genome size reduction, is observed in bacteria with very different life traits, including endosymbiotic bacteria and several marine bacteria, raising the question of its evolutionary origin. None of the hypotheses proposed in the literature is firmly established, mainly due to the many confounding factors related to the diverse habitats of species with streamlined genomes. Computational models may help overcome these difficulties and rigorously test hypotheses. In this work, we used Aevol, a platform designed to study the evolution of genome architecture, to test 2 main hypotheses: that an increase in population size (N) or mutation rate (μ) could cause genome reduction. In our experiments, both conditions lead to streamlining but have very different resulting genome structures. Under increased population sizes, genomes lose a significant fraction of noncoding sequences but maintain their coding size, resulting in densely packed genomes (akin to streamlined marine bacteria genomes). By contrast, under an increased mutation rate, genomes lose both coding and noncoding sequences (akin to endosymbiotic bacteria genomes). Hence, both factors lead to an overall reduction in genome size, but the coding density of the genome appears to be determined by N×μ. Thus, a broad range of genome size and density can be achieved by different combinations of N and μ. Our results suggest that genome size and coding density are determined by the interplay between selection for phenotypic adaptation and selection for robustness. Competing interests: The authors declare no competing interests. © The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
- Published
- 2024
- Full Text
- View/download PDF
19. Imbalanced speciation pulses sustain the radiation of mammals.
- Author
-
Quintero I, Lartillot N, and Morlon H
- Subjects
- Animals, Biodiversity, Extinction, Biological, Fossils, Genetic Speciation, Mammals classification, Mammals genetics, Phylogeny
- Abstract
The evolutionary histories of major clades, including mammals, often comprise changes in their diversification dynamics, but how these changes occur remains debated. We combined comprehensive phylogenetic and fossil information in a new "birth-death diffusion" model that provides a detailed characterization of variation in diversification rates in mammals. We found an early rising and sustained diversification scenario, wherein speciation rates increased before and during the Cretaceous-Paleogene (K-Pg) boundary. The K-Pg mass extinction event filtered out more slowly speciating lineages and was followed by a subsequent slowing in speciation rates rather than rebounds. These dynamics arose from an imbalanced speciation process, with separate lineages giving rise to many, less speciation-prone descendants. Diversity seems to have been brought about by these isolated, fast-speciating lineages, rather than by a few punctuated innovations.
- Published
- 2024
- Full Text
- View/download PDF
20. Bridging the gap between the evolutionary dynamics and the molecular mechanisms of meiosis: A model based exploration of the PRDM9 intra-genomic Red Queen.
- Author
-
Genestier A, Duret L, and Lartillot N
- Subjects
- Animals, Mice, Gene Conversion, DNA Breaks, Double-Stranded, Alleles, Models, Genetic, Humans, Recombination, Genetic, Histone-Lysine N-Methyltransferase genetics, Histone-Lysine N-Methyltransferase metabolism, Meiosis genetics, Evolution, Molecular
- Abstract
Molecular dissection of meiotic recombination in mammals, combined with population-genetic and comparative studies, has revealed a complex evolutionary dynamic characterized by short-lived recombination hotspots. Hotspots are chromosome positions containing DNA sequences where the protein PRDM9 can bind and cause crossing-over. To explain this fast evolutionary dynamic, a so-called intra-genomic Red Queen model has been proposed, based on the interplay between two antagonistic forces: biased gene conversion, mediated by double-strand breaks, resulting in hotspot extinction (the hotspot conversion paradox), followed by positive selection favoring mutant PRDM9 alleles recognizing new sequence motifs. Although this model predicts many empirical observations, the exact causes of the positive selection acting on new PRDM9 alleles are still not well understood. In this direction, experiments on mouse hybrids have suggested that, in addition to targeting double-strand breaks, PRDM9 has another role during meiosis. Specifically, PRDM9 symmetric binding (simultaneous binding at the same site on both homologues) would facilitate homology search and, as a result, the pairing of the homologues. Although discovered in hybrids, this second function of PRDM9 could also be involved in the evolutionary dynamic observed within populations. To address this point, here, we present a theoretical model of the evolutionary dynamic of meiotic recombination integrating current knowledge about the molecular function of PRDM9. Our modeling work gives important insights into the selective forces driving the turnover of recombination hotspots. Specifically, the reduced symmetrical binding of PRDM9 caused by the loss of high affinity binding sites induces a net positive selection eliciting new PRDM9 alleles recognizing new targets. The model also offers new insights about the influence of the gene dosage of PRDM9, which can paradoxically result in negative selection on new PRDM9 alleles entering the population, driving their eviction and thus reducing standing variation at this locus. Competing interests: The authors have declared that no competing interests exist. Copyright: © 2024 Genestier et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Published
- 2024
- Full Text
- View/download PDF
21. Compositionally Constrained Sites Drive Long-Branch Attraction.
- Author
-
Szánthó LL, Lartillot N, Szöllősi GJ, and Schrempf D
- Subjects
- Animals, Phylogeny, Bias, Models, Genetic, Microsporidia
- Abstract
Accurate phylogenies are fundamental to our understanding of the pattern and process of evolution. Yet, phylogenies at deep evolutionary timescales, with correspondingly long branches, have been fraught with controversy resulting from conflicting estimates from models with varying complexity and goodness of fit. Analyses of historical as well as current empirical datasets, such as alignments including Microsporidia, Nematoda, or Platyhelminthes, have demonstrated that inadequate modeling of across-site compositional heterogeneity, which is the result of biochemical constraints that lead to varying patterns of accepted amino acids along sequences, can lead to erroneous topologies that are strongly supported. Unfortunately, models that adequately account for across-site compositional heterogeneity remain computationally challenging or intractable for an increasing fraction of contemporary datasets. Here, we introduce "compositional constraint analysis," a method to investigate the effect of site-specific constraints on amino acid composition on phylogenetic inference. We show that more constrained sites with lower diversity and less constrained sites with higher diversity exhibit ostensibly conflicting signals under models ignoring across-site compositional heterogeneity that lead to long-branch attraction artifacts and demonstrate that more complex models accounting for across-site compositional heterogeneity can ameliorate this bias. We present CAT-posterior mean site frequencies (PMSF), a pipeline for diagnosing and resolving phylogenetic bias resulting from inadequate modeling of across-site compositional heterogeneity based on the CAT model. CAT-PMSF is robust against long-branch attraction in all alignments we have examined. We suggest using CAT-PMSF when convergence of the CAT model cannot be assured. We find evidence that compositionally constrained sites are driving long-branch attraction in two metazoan datasets and recover evidence for Porifera as the sister group to all other animals. [Animal phylogeny; cross-site heterogeneity; long-branch attraction; phylogenomics.] © The Author(s) 2023. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
- Published
- 2023
- Full Text
- View/download PDF
22. Identifying the Best Approximating Model in Bayesian Phylogenetics: Bayes Factors, Cross-Validation or wAIC?
- Author
-
Lartillot N
- Subjects
- Bayes Theorem, Computer Simulation, Probability, Markov Chains, Monte Carlo Method, Phylogeny
- Abstract
There is still no consensus as to how to select models in Bayesian phylogenetics, and more generally in applied Bayesian statistics. Bayes factors are often presented as the method of choice, yet other approaches have been proposed, such as cross-validation or information criteria. Each of these paradigms raises specific computational challenges, but they also differ in their statistical meaning, being motivated by different objectives: either testing hypotheses or finding the best-approximating model. These alternative goals entail different compromises, and as a result, Bayes factors, cross-validation, and information criteria may be valid for addressing different questions. Here, the question of Bayesian model selection is revisited, with a focus on the problem of finding the best-approximating model. Several model selection approaches were re-implemented, numerically assessed and compared: Bayes factors, cross-validation (CV), in its different forms (k-fold or leave-one-out), and the widely applicable information criterion (wAIC), which is asymptotically equivalent to leave-one-out cross-validation (LOO-CV). Using a combination of analytical results and empirical and simulation analyses, it is shown that Bayes factors are unduly conservative. In contrast, CV represents a more adequate formalism for selecting the model returning the best approximation of the data-generating process and the most accurate estimates of the parameters of interest. Among alternative CV schemes, LOO-CV and its asymptotic equivalent represented by the wAIC, stand out as the best choices, conceptually and computationally, given that both can be simultaneously computed based on standard Markov chain Monte Carlo runs under the posterior distribution. [Bayes factor; cross-validation; marginal likelihood; model comparison; wAIC.] © The Author(s) 2023. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
- Published
- 2023
- Full Text
- View/download PDF
23. Genes and sites under adaptation at the phylogenetic scale also exhibit adaptation at the population-genetic scale.
- Author
-
Latrille T, Rodrigue N, and Lartillot N
- Subjects
- Humans, Female, Pregnancy, Animals, Phylogeny, Placenta, Genetics, Population, Codon, Models, Genetic, Mammals genetics, Evolution, Molecular, Selection, Genetic
- Abstract
Adaptation in protein-coding sequences can be detected from multiple sequence alignments across species or alternatively by leveraging polymorphism data within a population. Across species, quantification of the adaptive rate relies on phylogenetic codon models, classically formulated in terms of the ratio of nonsynonymous over synonymous substitution rates. Evidence of an accelerated nonsynonymous substitution rate is considered a signature of pervasive adaptation. However, because of the background of purifying selection, these models are potentially limited in their sensitivity. Recent developments have led to more sophisticated mutation-selection codon models aimed at making a more detailed quantitative assessment of the interplay between mutation, purifying, and positive selection. In this study, we conducted a large-scale exome-wide analysis of placental mammals with mutation-selection models, assessing their performance at detecting proteins and sites under adaptation. Importantly, mutation-selection codon models are based on a population-genetic formalism and thus are directly comparable to the McDonald and Kreitman test at the population level to quantify adaptation. Taking advantage of this relationship between phylogenetic and population genetics analyses, we integrated divergence and polymorphism data across the entire exome for 29 populations across 7 genera and showed that proteins and sites detected to be under adaptation at the phylogenetic scale are also under adaptation at the population-genetic scale. Altogether, our exome-wide analysis shows that phylogenetic mutation-selection codon models and the population-genetic test of adaptation can be reconciled and are congruent, paving the way for integrative models and analyses across individuals and populations.
- Published
- 2023
- Full Text
- View/download PDF
24. An Improved Codon Modeling Approach for Accurate Estimation of the Mutation Bias.
- Author
-
Latrille T and Lartillot N
- Subjects
- Codon genetics, Models, Genetic, Mutation, Phylogeny, Genetic Code, Selection, Genetic
- Abstract
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation-selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions. © The Author(s) 2022. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
- Published
- 2022
- Full Text
- View/download PDF
25. Natural Selection beyond Life? A Workshop Report.
- Author
-
Charlat S, Ariew A, Bourrat P, Ferreira Ruiz M, Heams T, Huneman P, Krishna S, Lachmann M, Lartillot N, Le Sergeant d'Hendecourt L, Malaterre C, Nghe P, Rajon E, Rivoire O, Smerlak M, and Zeravcic Z
- Abstract
Natural selection is commonly seen not just as an explanation for adaptive evolution, but as the inevitable consequence of "heritable variation in fitness among individuals". Although it remains embedded in biological concepts, such a formalisation makes it tempting to explore whether this precondition may be met not only in life as we know it, but also in other physical systems. This would imply that these systems are subject to natural selection and may perhaps be investigated in a biological framework, where properties are typically examined in light of their putative functions. Here we relate the major questions that were debated during a three-day workshop devoted to discussing whether natural selection may take place in non-living physical systems. We start this report with a brief overview of research fields dealing with "life-like" or "proto-biotic" systems, where mimicking evolution by natural selection in test tubes stands as a major objective. We contend the challenge may be as much conceptual as technical. Taking the problem from a physical angle, we then discuss the framework of dissipative structures. Although life is viewed in this context as a particular case within a larger ensemble of physical phenomena, this approach does not provide general principles from which natural selection can be derived. Turning back to evolutionary biology, we ask to what extent the most general formulations of the necessary conditions or signatures of natural selection may be applicable beyond biology. In our view, such a cross-disciplinary jump is impeded by reliance on individuality as a central yet implicit and loosely defined concept. Overall, these discussions thus lead us to conjecture that understanding, in physico-chemical terms, how individuality emerges and how it can be recognised, will be essential in the search for instances of evolution by natural selection outside of living systems.
- Published
- 2021
- Full Text
- View/download PDF
26. Inferring Long-Term Effective Population Size with Mutation-Selection Models.
- Author
-
Latrille T, Lanore V, and Lartillot N
- Subjects
- Animals, Bayes Theorem, Evolution, Molecular, Mammals, Mutation, Phylogeny, Population Density, Models, Genetic, Selection, Genetic
- Abstract
Mutation-selection phylogenetic codon models are grounded on population genetics first principles and represent a principled approach for investigating the intricate interplay between mutation, selection, and drift. In their current form, mutation-selection codon models are entirely characterized by the collection of site-specific amino-acid fitness profiles. However, thus far, they have relied on the assumption of a constant genetic drift, translating into a unique effective population size (Ne) across the phylogeny, clearly an unrealistic assumption. This assumption can be alleviated by introducing variation in Ne between lineages. In addition to Ne, the mutation rate (μ) is susceptible to vary between lineages, and both should covary with life-history traits (LHTs). This suggests that the model should more globally account for the joint evolutionary process followed by all of these lineage-specific variables (Ne, μ, and LHTs). In this direction, we introduce an extended mutation-selection model jointly reconstructing in a Bayesian Monte Carlo framework the fitness landscape across sites and long-term trends in Ne, μ, and LHTs along the phylogeny, from an alignment of DNA coding sequences and a matrix of observed LHTs in extant species. The model was tested against simulated data and applied to empirical data in mammals, isopods, and primates. The reconstructed history of Ne in these groups appears to correlate with LHTs or ecological variables in a way that suggests that the reconstruction is reasonable, at least in its global trends. On the other hand, the range of variation in Ne inferred across species is surprisingly narrow. This last point suggests that some of the assumptions of the model, in particular concerning the assumed absence of epistatic interactions between sites, are potentially problematic. (© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2021
- Full Text
- View/download PDF
27. Reconstructing the History of Variation in Effective Population Size along Phylogenies.
- Author
-
Brevet M and Lartillot N
- Subjects
- Animals, Genetic Variation, Models, Genetic, Mutation Rate, Phylogeny, Population Density, Evolution, Molecular, Selection, Genetic
- Abstract
The nearly neutral theory predicts specific relations between effective population size (Ne) and patterns of divergence and polymorphism, which depend on the shape of the distribution of fitness effects (DFE) of new mutations. However, testing these relations is not straightforward, owing to the difficulty in estimating Ne. Here, we introduce an integrative framework allowing for an explicit reconstruction of the phylogenetic history of Ne, thus leading to a quantitative test of the nearly neutral theory and an estimation of the allometric scaling of the ratios of nonsynonymous over synonymous polymorphism (πN/πS) and divergence (dN/dS) with respect to Ne. As an illustration, we applied our method to primates, for which the nearly neutral predictions were mostly verified. Under a purely nearly neutral model with a constant DFE across species, we find that the variation in πN/πS and dN/dS as a function of Ne is too large to be compatible with current estimates of the DFE based on site frequency spectra. The reconstructed history of Ne shows a 10-fold variation across primates. The mutation rate per generation u, also reconstructed over the tree by the method, varies over a 3-fold range and is negatively correlated with Ne. As a result of these opposing trends for Ne and u, variation in πS is intermediate, primarily driven by Ne but substantially influenced by u. Altogether, our integrative framework provides a quantitative assessment of the role of Ne and u in modulating patterns of genetic variation, while giving a synthetic picture of their history over the clade. (© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2021
- Full Text
- View/download PDF
28. Erratum to: A Bayesian mutation-selection framework for detecting site-specific adaptive evolution in protein-coding genes.
- Author
-
Rodrigue N, Latrille T, and Lartillot N
- Published
- 2021
- Full Text
- View/download PDF
29. Detecting sex-linked genes using genotyped individuals sampled in natural populations.
- Author
-
Käfer J, Lartillot N, Marais GAB, and Picard F
- Subjects
- Genes, Plant, Genes, X-Linked, Genes, Y-Linked, Haplotypes, Humans, Models, Genetic, Polymorphism, Genetic, Recombination, Genetic, Silene genetics, Chromosome Mapping methods, Chromosomes, Human genetics, Chromosomes, Plant genetics, Sex Chromosomes genetics
- Abstract
We propose a method, SDpop, able to infer sex-linkage caused by recombination suppression typical of sex chromosomes. The method is based on the modeling of the allele and genotype frequencies of individuals of known sex in natural populations. It is implemented in a hierarchical probabilistic framework, accounting for different sources of error. It allows statistical testing for the presence or absence of sex chromosomes, and detection of sex-linked genes based on the posterior probabilities in the model. Furthermore, for gametologous sequences, the haplotype and level of nucleotide polymorphism of each copy can be inferred, as well as the divergence between them. We test the method using simulated data, as well as data from both a relatively recent and an old sex chromosome system (the plant Silene latifolia and humans) and show that, for most cases, robust predictions are obtained with 5 to 10 individuals per sex. (© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. All rights reserved. For permissions, please email: journals.permissions@oup.com.)
- Published
- 2021
- Full Text
- View/download PDF
30. Long-Lived Species of Bivalves Exhibit Low MT-DNA Substitution Rates.
- Author
-
Mortz M, Levivier A, Lartillot N, Dufresne F, and Blier PU
- Abstract
Bivalves represent a valuable taxonomic group for aging studies given their wide variation in longevity (from 1-2 to >500 years). It is well known that aging is associated with the maintenance of reactive oxygen species homeostasis and that the accumulation of mitochondrial phenotype and genotype dysfunctions is a hallmark of these processes. Previous studies have shown that mitochondrial DNA mutation rates are linked to lifespan in vertebrate species, but no study has explored this in invertebrates. To this end, we performed an analysis based on a Bayesian phylogenetic covariance model of evolution, using 12 mitochondrial protein-coding genes of 76 bivalve species. Three life history traits (maximum longevity, generation time and mean temperature tolerance) were tested against 1) synonymous substitution rates (dS), 2) conservative amino acid replacement rates (Kc) and 3) ratios of radical over conservative amino acid replacement rates (Kr/Kc). Our results confirm the already known correlation between longevity and generation time and show, for the first time in an invertebrate class, a significant negative correlation between dS and longevity. This correlation was not as strong when generation time and mean temperature tolerance variations were also considered in our model (marginal correlation), suggesting a confounding effect of these traits on the relationship between longevity and mtDNA substitution rate. By confirming the negative correlation between dS and longevity previously documented in birds and mammals, our results provide support for a general pattern in substitution rates. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2021 Mortz, Levivier, Lartillot, Dufresne and Blier.)
- Published
- 2021
- Full Text
- View/download PDF
31. Publisher Correction: Universal probabilistic programming offers a powerful approach to statistical phylogenetics.
- Author
-
Ronquist F, Kudlicka J, Senderov V, Borgström J, Lartillot N, Lundén D, Murray L, Schön TB, and Broman D
- Published
- 2021
- Full Text
- View/download PDF
32. A Bayesian Mutation-Selection Framework for Detecting Site-Specific Adaptive Evolution in Protein-Coding Genes.
- Author
-
Rodrigue N, Latrille T, and Lartillot N
- Subjects
- Bayes Theorem, Biological Evolution, Genetic Techniques, Models, Genetic, Mutation, Selection, Genetic
- Abstract
In recent years, codon substitution models based on the mutation-selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes (across the entire gene) or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation-selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach to a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method, and to gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program. (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2021
- Full Text
- View/download PDF
33. Universal probabilistic programming offers a powerful approach to statistical phylogenetics.
- Author
-
Ronquist F, Kudlicka J, Senderov V, Borgström J, Lartillot N, Lundén D, Murray L, Schön TB, and Broman D
- Subjects
- Animals, Bayes Theorem, Birds genetics, Data Interpretation, Statistical, Models, Statistical, Monte Carlo Method, Probability, Programming Languages, Artificial Intelligence, Biological Evolution, Biostatistics, Birds physiology, Phylogeny, Software
- Abstract
Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here, we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.
- Published
- 2021
- Full Text
- View/download PDF
34. Scalable Empirical Mixture Models That Account for Across-Site Compositional Heterogeneity.
- Author
-
Schrempf D, Lartillot N, and Szöllősi G
- Subjects
- Cluster Analysis, Amino Acid Substitution, Genetic Techniques, Models, Genetic, Phylogeny, Software
- Abstract
Biochemical demands constrain the range of amino acids acceptable at specific sites, resulting in across-site compositional heterogeneity of the amino acid replacement process. Phylogenetic models that disregard this heterogeneity are prone to systematic errors, which can lead to severe long-branch attraction artifacts. State-of-the-art models accounting for across-site compositional heterogeneity include the CAT model, which is computationally expensive, and empirical distribution mixture models estimated via maximum likelihood (C10-C60 models). Here, we present a new, scalable method, EDCluster, for finding empirical distribution mixture models involving a simple cluster analysis. The cluster analysis utilizes specific coordinate transformations which allow the detection of specialized amino acid distributions either from curated databases or from the alignment at hand. We apply EDCluster to the HOGENOM and HSSP databases in order to provide universal distribution mixture (UDM) models comprising up to 4,096 components. Detailed analyses of the UDM models demonstrate the removal of various long-branch attraction artifacts and improved performance compared with the C10-C60 models. Ready-to-use implementations of the UDM models are provided for three established software packages (IQ-TREE, Phylobayes, and RevBayes). (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2020
- Full Text
- View/download PDF
35. From Inquilines to Gall Inducers: Genomic Signature of a Life-Style Transition in Synergus Gall Wasps.
- Author
-
Gobbo E, Lartillot N, Hearn J, Stone GN, Abe Y, Wheat CW, Ide T, and Ronquist F
- Subjects
- Animals, Gene Duplication, Models, Genetic, Quercus parasitology, Biological Evolution, Genome, Insect, Plant Tumors parasitology, Selection, Genetic, Wasps physiology
- Abstract
Gall wasps (Hymenoptera: Cynipidae) induce complex galls on oaks, roses, and other plants, but the mechanism of gall induction is still unknown. Here, we take a comparative genomic approach to revealing the genetic basis of gall induction. We focus on Synergus itoensis, a species that induces galls inside oak acorns. Previous studies suggested that this species evolved the ability to initiate gall formation recently, as it is deeply nested within the genus Synergus, whose members are mostly inquilines that develop inside the galls of other species. We compared the genome of S. itoensis with that of three related Synergus inquilines to identify genomic changes associated with the origin of gall induction. We used a novel Bayesian selection analysis, which accounts for branch-specific and gene-specific selection effects, to search for signatures of selection in 7,600 single-copy orthologous genes shared by the four Synergus species. We found that the terminal branch leading to S. itoensis had more genes with a significantly elevated dN/dS ratio (positive signature genes) than the other terminal branches in the tree; the S. itoensis branch also had more genes with a significantly decreased dN/dS ratio. Gene set enrichment analysis showed that the positive signature gene set of S. itoensis, unlike those of the inquiline species, is enriched in several biological process Gene Ontology terms, the most prominent of which is "Ovarian Follicle Cell Development." Our results indicate that the origin of gall induction is associated with distinct genomic changes, and provide a good starting point for further characterization of the genes involved. (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2020
- Full Text
- View/download PDF
36. Detecting adaptive convergent amino acid evolution.
- Author
-
Rey C, Lanore V, Veber P, Guéguen L, Lartillot N, Sémon M, and Boussau B
- Subjects
- Amino Acids metabolism, Animals, Genomics, Humans, Models, Genetic, Phylogeny, Proteins metabolism, Amino Acids genetics, Evolution, Molecular, Proteins genetics
- Abstract
In evolutionary genomics, researchers have taken an interest in identifying substitutions that subtend convergent phenotypic adaptations. This is a difficult question that requires distinguishing foreground convergent substitutions that are involved in the convergent phenotype from background convergent substitutions. Those may be linked to other adaptations, may be neutral or may be the consequence of mutational biases. Furthermore, there is no generally accepted definition of convergent substitutions. Various methods that use different definitions have been proposed in the literature, resulting in different sets of candidate foreground convergent substitutions. In this article, we first describe the processes that can generate foreground convergent substitutions in coding sequences, separating adaptive from non-adaptive processes. Second, we review methods that have been proposed to detect foreground convergent substitutions in coding sequences and expose the assumptions that underlie them. Finally, we examine their power on simulations of convergent changes (including in the presence of a change in the efficacy of selection) and on empirical alignments. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
- Published
- 2019
- Full Text
- View/download PDF
37. Erratum: Life History Traits Impact the Nuclear Rate of Substitution but Not the Mitochondrial Rate in Isopods.
- Author
-
Saclier N, François CM, Konecny-Dupré L, Lartillot N, Guéguen L, Duret L, Malard F, Douady CJ, and Lefébure T
- Published
- 2019
- Full Text
- View/download PDF
38. Life History Traits Impact the Nuclear Rate of Substitution but Not the Mitochondrial Rate in Isopods.
- Author
-
Saclier N, François CM, Konecny-Dupré L, Lartillot N, Guéguen L, Duret L, Malard F, Douady CJ, and Lefébure T
- Subjects
- Animals, DNA Replication, Ecosystem, Electron Transport, Isopoda metabolism, Isopoda radiation effects, Protein Biosynthesis, Selection, Genetic, Evolution, Molecular, Genome, Mitochondrial, Isopoda genetics, Life History Traits
- Abstract
The rate of molecular evolution varies widely among species. Life history traits (LHTs) have been proposed as a major driver of these variations. However, the relative contribution of each trait is poorly understood. Here, we test the influence of metabolic rate (MR), longevity, and generation time (GT) on the nuclear and mitochondrial synonymous substitution rates using a group of isopod species that have made multiple independent transitions to subterranean environments. Subterranean species have repeatedly evolved a lower MR, a longer lifespan and a longer GT. We assembled the nuclear transcriptomes and the mitochondrial genomes of 13 pairs of closely related isopods, each pair composed of one surface and one subterranean species. We found that subterranean species have a lower rate of nuclear synonymous substitution than surface species whereas the mitochondrial rate remained unchanged. We propose that this decoupling between nuclear and mitochondrial rates comes from different DNA replication processes in these two compartments. In isopods, the nuclear rate is probably tightly controlled by GT alone. In contrast, mitochondrial genomes appear to replicate and mutate at a rate independent of LHTs. These results are incongruent with previous studies, which were mostly devoted to vertebrates. We suggest that this incongruence can be explained by developmental differences between animal clades, with a quiescent period during female gametogenesis in mammals and birds which imposes a nuclear and mitochondrial rate coupling, as opposed to the continuous gametogenesis observed in most arthropods.
- Published
- 2018
- Full Text
- View/download PDF
39. Conditional Approximate Bayesian Computation: A New Approach for Across-Site Dependency in High-Dimensional Mutation-Selection Models.
- Author
-
Laurin-Lemay S, Rodrigue N, Lartillot N, and Philippe H
- Subjects
- Animals, Bayes Theorem, Humans, Mammals genetics, Monte Carlo Method, Evolution, Molecular, Genetic Techniques, Models, Genetic, Mutation, Selection, Genetic
- Abstract
A key question in molecular evolutionary biology concerns the relative roles of mutation and selection in shaping genomic data. Moreover, features of mutation and selection are heterogeneous along the genome and over time. Mechanistic codon substitution models based on the mutation-selection framework are promising approaches to separating these effects. In practice, however, several complications arise, since accounting for such heterogeneities often implies handling models of high dimensionality (e.g., amino acid preferences), or leads to across-site dependence (e.g., CpG hypermutability), making the likelihood function intractable. Approximate Bayesian Computation (ABC) could address this latter issue. Here, we propose a new approach, named Conditional ABC (CABC), which combines the sampling efficiency of MCMC and the flexibility of ABC. To illustrate the potential of the CABC approach, we apply it to the study of mammalian CpG hypermutability based on a new mutation-level parameter implying dependence across adjacent sites, combined with site-specific purifying selection on amino-acids captured by a Dirichlet process. Our proof-of-concept of the CABC methodology opens new modeling perspectives. Our application of the method reveals a high level of heterogeneity of CpG hypermutability across loci and mild heterogeneity across taxonomic groups; and finally, we show that CpG hypermutability is an important evolutionary factor in rendering relative synonymous codon usage. All source code is available as a GitHub repository (https://github.com/Simonll/LikelihoodFreePhylogenetics.git).
- Published
- 2018
- Full Text
- View/download PDF
40. Correction: Molecular adaptation in Rubisco: Discriminating between convergent evolution and positive selection using mechanistic and classical codon models.
- Author
-
Parto S and Lartillot N
- Abstract
[This corrects the article DOI: 10.1371/journal.pone.0192697.].
- Published
- 2018
- Full Text
- View/download PDF
41. Molecular adaptation in Rubisco: Discriminating between convergent evolution and positive selection using mechanistic and classical codon models.
- Author
-
Parto S and Lartillot N
- Subjects
- Carbon Dioxide metabolism, Photosynthesis, Phylogeny, Codon, Ribulose-Bisphosphate Carboxylase metabolism
- Abstract
Rubisco (Ribulose-1,5-bisphosphate carboxylase/oxygenase) is the most important enzyme on earth, catalyzing the first step of photosynthetic CO2 fixation. So, without it, there would be no storing of the sun's energy in plants. Molecular adaptation of Rubisco to the C4 photosynthetic pathway has attracted a lot of attention. C4 plants, which comprise less than 5% of land plants, have evolved more efficient photosynthesis compared to C3 plants. Interestingly, a large number of independent transitions from the C3 to the C4 phenotype have occurred. Each time, the Rubisco enzyme has been subject to similar changes in selective pressure, thus providing an excellent model for convergent evolution at the molecular level. Molecular adaptation is often identified with positive selection and is typically characterized by an elevated ratio of non-synonymous to synonymous substitution rate (dN/dS). However, convergent adaptation is expected to leave a different molecular signature, taking the form of repeated transitions toward identical or similar amino acids. Here, we used a previously introduced codon-based differential-selection model to detect and quantify consistent patterns of convergent adaptation in Rubisco in eudicots. We further contrasted our results with those obtained by classical codon models based on the estimation of dN/dS. We found that the two classes of models tend to select distinct, although overlapping, sets of positions. This discrepancy in the results illustrates the conceptual difference between these models while emphasizing the need to better discriminate between qualitatively different selective regimes, by using a broader class of codon models than those currently considered in molecular evolutionary studies.
- Published
- 2018
- Full Text
- View/download PDF
42. The Red Queen model of recombination hot-spot evolution: a theoretical investigation.
- Author
-
Latrille T, Duret L, and Lartillot N
- Subjects
- Animals, Mice, Models, Genetic, Gene Conversion, Genetic Variation, Recombination, Genetic
- Abstract
In humans and many other species, recombination events cluster in narrow and short-lived hot spots distributed across the genome, whose location is determined by the Zn-finger protein PRDM9. To explain these fast evolutionary dynamics, an intra-genomic Red Queen model has been proposed, based on the interplay between two antagonistic forces: biased gene conversion, mediated by double-strand breaks, resulting in hot-spot extinction, followed by positive selection favouring new PRDM9 alleles recognizing new sequence motifs. Thus far, however, this Red Queen model has not been formalized as a quantitative population-genetic model, fully accounting for the intricate interplay between biased gene conversion, mutation, selection, demography and genetic diversity at the PRDM9 locus. Here, we explore the population genetics of the Red Queen model of recombination. A Wright-Fisher simulator was implemented, allowing exploration of the behaviour of the model (mean equilibrium recombination rate, diversity at the PRDM9 locus or turnover rate) as a function of the parameters (effective population size, mutation and erosion rates). In a second step, analytical results based on self-consistent mean-field approximations were derived, reproducing the scaling relations observed in the simulations. Empirical fit of the model to current data from the mouse suggests both a high mutation rate at PRDM9 and strong biased gene conversion on its targets. This article is part of the themed issue 'Evolutionary causes and consequences of recombination rate variation in sexual organisms'. (© 2017 The Authors.)
- Published
- 2017
- Full Text
- View/download PDF
43. Improved Modeling of Compositional Heterogeneity Supports Sponges as Sister to All Other Animals.
- Author
-
Feuda R, Dohrmann M, Pett W, Philippe H, Rota-Stabelli O, Lartillot N, Wörheide G, and Pisani D
- Subjects
- Animals, Sequence Analysis, Protein, Biological Evolution, Phylogeny, Porifera classification
- Abstract
The relationships at the root of the animal tree have proven difficult to resolve, with the current debate focusing on whether sponges (phylum Porifera) or comb jellies (phylum Ctenophora) are the sister group of all other animals [1-5]. The choice of evolutionary models seems to be at the core of the problem because Porifera tends to emerge as the sister group of all other animals ("Porifera-sister") when site-specific amino acid differences are modeled (e.g., [6, 7]), whereas Ctenophora emerges as the sister group of all other animals ("Ctenophora-sister") when they are ignored (e.g., [8-11]). We show that two key phylogenomic datasets that previously supported Ctenophora-sister [10, 12] display strong heterogeneity in amino acid composition across sites and taxa and that no routinely used evolutionary model can adequately describe both forms of heterogeneity. We show that data-recoding methods [13-15] reduce compositional heterogeneity in these datasets and that models accommodating site-specific amino acid preferences can better describe the recoded datasets. Increased model adequacy is associated with significant topological changes in support of Porifera-sister. Because adequate modeling of the evolutionary process that generated the data is fundamental to recovering an accurate phylogeny [16-20], our results strongly support sponges as the sister group of all other animals and provide further evidence that Ctenophora-sister represents a tree reconstruction artifact. (Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.)
- Published
- 2017
- Full Text
- View/download PDF
44. Detecting consistent patterns of directional adaptation using differential selection codon models.
- Author
-
Parto S and Lartillot N
- Subjects
- Amino Acid Sequence, Evolution, Molecular, HIV physiology, Humans, Mutation, Open Reading Frames, Phylogeny, Selection, Genetic, gag Gene Products, Human Immunodeficiency Virus chemistry, gag Gene Products, Human Immunodeficiency Virus genetics, Bayes Theorem, Codon, HIV genetics, Models, Genetic
- Abstract
Background: Phylogenetic codon models are often used to characterize the selective regimes acting on protein-coding sequences. Recent methodological developments have led to models explicitly accounting for the interplay between mutation and selection, by modeling the amino acid fitness landscape along the sequence. However, thus far, most of these models have assumed that the fitness landscape is constant over time. Fluctuations of the fitness landscape may often be random or depend on complex and unknown factors. However, some organisms may be subject to systematic changes in selective pressure, resulting in reproducible molecular adaptations across independent lineages subject to similar conditions. Results: Here, we introduce a codon-based differential selection model, which aims to detect and quantify the fine-grained consistent patterns of adaptation at the protein-coding level, as a function of external conditions experienced by the organism under investigation. The model parameterizes the global mutational pressure, as well as the site- and condition-specific amino acid selective preferences. This phylogenetic model is implemented in a Bayesian MCMC framework. After validation with simulations, we applied our method to a dataset of HIV sequences from patients with known HLA genetic background. Our differential selection model detects and characterizes differentially selected coding positions specifically associated with two different HLA alleles. Conclusion: Our differential selection model is able to identify consistent molecular adaptations as a function of repeated changes in the environment of the organism. These models can be applied to many other problems, ranging from viral adaptation to evolution of life-history strategies in plants or animals.
- Published
- 2017
- Full Text
- View/download PDF
45. Detecting Adaptation in Protein-Coding Genes Using a Bayesian Site-Heterogeneous Mutation-Selection Codon Substitution Model.
- Author
-
Rodrigue N and Lartillot N
- Subjects
- Amino Acids genetics, Bayes Theorem, Computer Simulation, Epistasis, Genetic, Evolution, Molecular, Genetic Heterogeneity, Mutation, Mutation Rate, Phylogeny, Adaptation, Biological genetics, Amino Acid Substitution, Codon, Models, Genetic, Selection, Genetic genetics
- Abstract
Codon substitution models have traditionally attempted to uncover signatures of adaptation within protein-coding genes by contrasting the rates of synonymous and non-synonymous substitutions. Another modeling approach, known as the mutation-selection framework, attempts to explicitly account for selective patterns at the amino acid level, with some approaches allowing for heterogeneity in these patterns across codon sites. Under such a model, substitutions at a given position occur at the neutral or nearly neutral rate when they are synonymous, or when they correspond to replacements between amino acids of similar fitness; substitutions from high to low (low to high) fitness amino acids have comparatively low (high) rates. Here, we study the use of such a mutation-selection framework as a null model for the detection of adaptation. Following previous works in this direction, we include a deviation parameter that has the effect of capturing the surplus, or deficit, in non-synonymous rates, relative to what would be expected under a mutation-selection modeling framework that includes a Dirichlet process approach to account for across-codon-site variation in amino acid fitness profiles. We use simulations, along with a few real data sets, to study the behavior of the approach, and find it to have good power with a low false-positive rate. Altogether, we emphasize the potential of recent mutation-selection models in the detection of adaptation, calling for further model refinements as well as large-scale applications. (© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2017
- Full Text
- View/download PDF
46. Closing the gap between rocks and clocks using total-evidence dating.
- Author
-
Ronquist F, Lartillot N, and Phillips MJ
- Subjects
- Animals, Calibration, Evolution, Molecular, Phylogeny, Time, Biological Evolution, Fossils anatomy & histology, Mammals anatomy & histology, Mammals genetics
- Abstract
Total-evidence dating (TED) allows evolutionary biologists to incorporate a wide range of dating information into a unified statistical analysis. One might expect this to improve the agreement between rocks and clocks but this is not necessarily the case. We explore the reasons for such discordance using a mammalian dataset with rich molecular, morphological and fossil information. There is strong conflict in this dataset between morphology and molecules under standard stochastic models. This causes TED to push divergence events back in time when using inadequate models or vague priors, a phenomenon we term 'deep root attraction' (DRA). We identify several causes of DRA. Failure to account for diversified sampling results in dramatic DRA, but this can be addressed using existing techniques. Inadequate morphological models also appear to be a major contributor to DRA. The major reason seems to be that current models do not account for dependencies among morphological characters, causing distorted topology and branch length estimates. This is particularly problematic for huge morphological datasets, which may contain large numbers of correlated characters. Finally, diversification and fossil sampling priors that do not incorporate all the available background information can contribute to DRA, but these priors can also be used to compensate for DRA. Specifically, we show that DRA in the mammalian dataset can be addressed by introducing a modest extra penalty for ghost lineages that are unobserved in the fossil record, for instance by assuming rapid diversification, rare extinction or high fossil sampling rate; any of these assumptions produces highly congruent divergence time estimates with a minimal gap between rocks and clocks. 
Under these conditions, fossils have a stabilizing influence on divergence time estimates and significantly increase the precision of those estimates, which are generally close to the dates suggested by palaeontologists. This article is part of the themed issue 'Dating species divergences using rocks and clocks'., (© 2016 The Authors.)
- Published
- 2016
- Full Text
- View/download PDF
47. A mixed relaxed clock model.
- Author
-
Lartillot N, Phillips MJ, and Ronquist F
- Subjects
- Animals, Bayes Theorem, Time Factors, Evolution, Molecular, Fossils anatomy & histology, Mammals genetics, Phylogeny
- Abstract
Over recent years, several alternative relaxed clock models have been proposed in the context of Bayesian dating. These models fall in two distinct categories: uncorrelated and autocorrelated across branches. The choice between these two classes of relaxed clocks is still an open question. More fundamentally, the true process of rate variation may have both long-term trends and short-term fluctuations, suggesting that more sophisticated clock models unfolding over multiple time scales should ultimately be developed. Here, a mixed relaxed clock model is introduced, which can be mechanistically interpreted as a rate variation process undergoing short-term fluctuations on the top of Brownian long-term trends. Statistically, this mixed clock represents an alternative solution to the problem of choosing between autocorrelated and uncorrelated relaxed clocks, by proposing instead to combine their respective merits. Fitting this model on a dataset of 105 placental mammals, using both node-dating and tip-dating approaches, suggests that the two pure clocks, Brownian and white noise, are rejected in favour of a mixed model with approximately equal contributions for its uncorrelated and autocorrelated components. The tip-dating analysis is particularly sensitive to the choice of the relaxed clock model. In this context, the classical pure Brownian relaxed clock appears to be overly rigid, leading to biases in divergence time estimation. By contrast, the use of a mixed clock leads to more recent and more reasonable estimates for the crown ages of placental orders and superorders. Altogether, the mixed clock introduced here represents a first step towards empirically more adequate models of the patterns of rate variation across phylogenetic trees. This article is part of the themed issue 'Dating species divergences using rocks and clocks'., (© 2016 The Authors.)
- Published
- 2016
- Full Text
- View/download PDF
48. RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.
- Author
-
Höhna S, Landis MJ, Heath TA, Boussau B, Lartillot N, Moore BR, Huelsenbeck JP, and Ronquist F
- Subjects
- Bayes Theorem, Classification methods, Models, Biological, Phylogeny, Software
- Abstract
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]., (© The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.)
- Published
- 2016
- Full Text
- View/download PDF
49. Reply to Halanych et al.: Ctenophore misplacement is corroborated by independent datasets.
- Author
-
Pisani D, Pett W, Dohrmann M, Feuda R, Rota-Stabelli O, Philippe H, Lartillot N, and Wörheide G
- Subjects
- Animals, Ctenophora classification, Ctenophora genetics, Databases, Genetic, Genome
- Published
- 2016
- Full Text
- View/download PDF
50. Genomic data do not support comb jellies as the sister group to all other animals.
- Author
-
Pisani D, Pett W, Dohrmann M, Feuda R, Rota-Stabelli O, Philippe H, Lartillot N, and Wörheide G
- Subjects
- Animals, Bayes Theorem, Bias, Likelihood Functions, Models, Genetic, Phylogeny, Reproducibility of Results, Selection, Genetic, Ctenophora classification, Ctenophora genetics, Databases, Genetic, Genome
- Abstract
Understanding how complex traits, such as epithelia, nervous systems, muscles, or guts, originated depends on a well-supported hypothesis about the phylogenetic relationships among major animal lineages. Traditionally, sponges (Porifera) have been interpreted as the sister group to the remaining animals, a hypothesis consistent with the conventional view that the last common animal ancestor was relatively simple and more complex body plans arose later in evolution. However, this premise has recently been challenged by analyses of the genomes of comb jellies (Ctenophora), which, instead, found ctenophores as the sister group to the remaining animals (the "Ctenophora-sister" hypothesis). Because ctenophores are morphologically complex predators with true epithelia, nervous systems, muscles, and guts, this scenario implies these traits were either present in the last common ancestor of all animals and were lost secondarily in sponges and placozoans (Trichoplax) or, alternatively, evolved convergently in comb jellies. Here, we analyze representative datasets from recent studies supporting Ctenophora-sister, including genome-scale alignments of concatenated protein sequences, as well as a genomic gene content dataset. We found no support for Ctenophora-sister and conclude it is an artifact resulting from inadequate methodology, especially the use of simplistic evolutionary models and inappropriate choice of species to root the metazoan tree. Our results reinforce a traditional scenario for the evolution of complexity in animals, and indicate that inferences about the evolution of Metazoa based on the Ctenophora-sister hypothesis are not supported by the currently available data.
- Published
- 2015
- Full Text
- View/download PDF