10 results on '"Creevey CJ"'
Search Results
2. Inadvertent Paralog Inclusion Drives Artifactual Topologies and Timetree Estimates in Phylogenomics.
- Author
- Siu-Ting K, Torres-Sánchez M, San Mauro D, Wilcockson D, Wilkinson M, Pisani D, O'Connell MJ, and Creevey CJ
- Subjects
- Amphibians genetics, Animals, Gene Duplication, Genetic Techniques, Phylogeny, Transcriptome
- Abstract
Increasingly, large phylogenomic data sets include transcriptomic data from nonmodel organisms. This not only has allowed controversial and unexplored evolutionary relationships in the tree of life to be addressed but also increases the risk of inadvertent inclusion of paralogs in the analysis. Although this may be expected to result in decreased phylogenetic support, it is not clear if it could also drive highly supported artifactual relationships. Many groups, including the hyperdiverse Lissamphibia, are especially susceptible to these issues due to ancient gene duplication events and small numbers of sequenced genomes and because transcriptomes are increasingly applied to resolve historically conflicting taxonomic hypotheses. We tested the potential impact of paralog inclusion on the topologies and timetree estimates of the Lissamphibia using published and de novo sequencing data including 18 amphibian species, from which 2,656 single-copy gene families were identified. A novel paralog filtering approach resulted in four differently curated data sets, which were used for phylogenetic reconstructions using Bayesian inference, maximum likelihood, and quartet-based supertrees. We found that paralogs drive strongly supported conflicting hypotheses within the Lissamphibia (Batrachia and Procera) and older divergence time estimates even within groups where no variation in topology was observed. All investigated methods, except Bayesian inference with the CAT-GTR model, were found to be sensitive to paralogs, but with filtering, convergence to the same answer (Batrachia) was observed. This is the first large-scale study to address the impact of orthology selection using transcriptomic data and emphasizes the importance of quality over quantity particularly for understanding relationships of poorly sampled taxa. (© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2019
3. Concatabominations: identifying unstable taxa in morphological phylogenetics using a heuristic extension to safe taxonomic reduction.
- Author
- Siu-Ting K, Pisani D, Creevey CJ, and Wilkinson M
- Subjects
- Animals, Classification methods, Phylogeny
- Published
- 2015
4. Identifying single copy orthologs in Metazoa.
- Author
- Creevey CJ, Muller J, Doerks T, Thompson JD, Arendt D, and Bork P
- Subjects
- Animals, Databases, Genetic, Evolution, Molecular, Expressed Sequence Tags, Humans, Multigene Family, Gene Dosage, Genome genetics, Genomics methods, Phylogeny
- Abstract
The identification of single copy (1-to-1) orthologs in any group of organisms is important for functional classification and phylogenetic studies. The Metazoa are no exception, but only recently has there been a wide-enough distribution of taxa with sufficiently high quality sequenced genomes to gain confidence in the wide-spread single copy status of a gene. Here, we present a phylogenetic approach for identifying overlooked single copy orthologs from multigene families and apply it to the Metazoa. Using 18 sequenced metazoan genomes of high quality we identified a robust set of 1,126 orthologous groups that have been retained in single copy since the last common ancestor of Metazoa. We found that the use of the phylogenetic procedure increased the number of single copy orthologs found by over a third more than standard taxon-count approaches. The orthologs represented a wide range of functional categories, expression profiles and levels of divergence. To demonstrate the value of our set of single copy orthologs, we used them to assess the completeness of 24 currently published metazoan genomes and 62 EST datasets. We found that the annotated genes in published genomes vary in coverage from 79% (Ciona intestinalis) to 99.8% (human) with an average of 92%, suggesting a value for the underlying error rate in genome annotation, and a strategy for identifying single copy orthologs in larger datasets. In contrast, the vast majority of EST datasets with no corresponding genome sequence available are largely under-sampled and probably do not accurately represent the actual genomic complement of the organisms from which they are derived. (© 2011 Creevey et al.)
- Published
- 2011
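The abstract of result 4 contrasts a phylogenetic procedure with "standard taxon-count approaches" and then scores genome completeness against the resulting ortholog set. The sketch below illustrates only the simpler taxon-count baseline and a percentage coverage score. It is not the authors' phylogenetic method, and the data layout, function names, and toy gene identifiers are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical input layout: family id -> {taxon -> gene ids in that family}.
Families = Dict[str, Dict[str, List[str]]]


def single_copy_families(families: Families, taxa: Set[str]) -> List[str]:
    """Return families with exactly one gene in every taxon (taxon-count rule)."""
    return sorted(
        fam
        for fam, members in families.items()
        if all(len(members.get(taxon, [])) == 1 for taxon in taxa)
    )


def completeness(found: Set[str], reference: List[str]) -> float:
    """Percentage of reference single-copy families present in a gene set."""
    if not reference:
        raise ValueError("reference set is empty")
    present = sum(1 for fam in reference if fam in found)
    return 100.0 * present / len(reference)


# Toy data, invented for illustration only.
toy = {
    "famA": {"human": ["h1"], "fly": ["f1"], "worm": ["w1"]},
    "famB": {"human": ["h2", "h3"], "fly": ["f2"], "worm": ["w2"]},  # duplicated in human
    "famC": {"human": ["h4"], "fly": ["f3"], "worm": ["w3"]},
    "famD": {"human": ["h5"], "fly": ["f4"]},  # absent from worm
}

reference = single_copy_families(toy, {"human", "fly", "worm"})
# A hypothetical annotation that recovers only famA from the reference set.
coverage = completeness({"famA"}, reference)
```

A family that is duplicated or missing in any taxon is discarded by this rule, which is the limitation the abstract says the phylogenetic procedure is meant to address.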
5. Trees from trees: construction of phylogenetic supertrees using clann.
- Author
- Creevey CJ and McInerney JO
- Subjects
- Algorithms, Computational Biology methods, Phylogeny, Software
- Abstract
Supertree methods combine multiple phylogenetic trees to produce the overall best "supertree." They can be used to combine phylogenetic information from datasets only partially overlapping and from disparate sources (like molecular and morphological data), or to break down problems thought to be computationally intractable. Some of the longest standing phylogenetic conundrums are now being brought to light using supertree approaches. We describe the most widely used supertree methods implemented in the software program "clann" and provide a step by step tutorial for investigating phylogenetic information and reconstructing the best supertree. Clann is freely available for Windows, Mac and Unix/Linux operating systems under the GNU public licence at (http://bioinf.nuim.ie/software/clann).
- Published
- 2009
6. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified.
- Author
- Keane TM, Creevey CJ, Pentony MM, Naughton TJ, and McInerney JO
- Subjects
- Animals, Archaea chemistry, Archaea genetics, Likelihood Functions, Markov Chains, Models, Genetic, Proteins chemistry, Proteins genetics, Proteobacteria chemistry, Proteobacteria genetics, Reproducibility of Results, Sequence Alignment, Vertebrates genetics, Amino Acid Substitution genetics, Computational Biology methods, Databases, Genetic, Evolution, Molecular, Phylogeny
- Abstract
Background: In recent years, model based approaches such as maximum likelihood have become the methods of choice for constructing phylogenies. A number of authors have shown the importance of using adequate substitution models in order to produce accurate phylogenies. In the past, many empirical models of amino acid substitution have been derived using a variety of different methods and protein datasets. These matrices are normally used as surrogates, rather than deriving the maximum likelihood model from the dataset being examined. With few exceptions, selection between alternative matrices has been carried out in an ad hoc manner.
Results: We start by highlighting the potential dangers of arbitrarily choosing protein models by demonstrating an empirical example where a single alignment can produce two topologically different and strongly supported phylogenies using two different arbitrarily-chosen amino acid substitution models. We demonstrate that in simple simulations, statistical methods of model selection are indeed robust and likely to be useful for protein model selection. We have investigated patterns of amino acid substitution among homologous sequences from the three Domains of life and our results show that no single amino acid matrix is optimal for any of the datasets. Perhaps most interestingly, we demonstrate that for two large datasets derived from the proteobacteria and archaea, one of the most favored models in both datasets is a model that was originally derived from retroviral Pol proteins.
Conclusion: This demonstrates that choosing protein models based on their source or method of construction may not be appropriate.
- Published
- 2006
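The abstract of result 6 argues for statistical rather than ad hoc selection among amino acid substitution matrices. As a minimal sketch of that idea, the code below ranks candidate models by the Akaike information criterion. The abstract does not say which criterion or which matrices were used, so the choice of AIC, the model names, and the log-likelihood values are all assumptions; the numbers are invented.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_free_params: int) -> float:
    """Akaike information criterion, 2k - 2 lnL; lower is better."""
    return 2.0 * n_free_params - 2.0 * log_likelihood


def best_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the model with the lowest AIC from {name: (lnL, free params)}."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Invented log-likelihoods for one alignment under three candidate models.
# "+F" stands for estimating the 20 amino acid frequencies from the data,
# which adds 19 free parameters.
fits = {
    "JTT": (-10250.0, 0),
    "WAG": (-10190.0, 0),
    "WAG+F": (-10185.0, 19),
}

chosen = best_model(fits)
```

In this toy case the extra parameters of the "+F" variant improve the likelihood too little to offset the penalty, so the plain matrix is preferred. In practice the log-likelihoods would come from fitting each model to the same alignment and tree.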
7. Toward automatic reconstruction of a highly resolved tree of life.
- Author
- Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, and Bork P
- Subjects
- Amino Acyl-tRNA Synthetases genetics, Animals, Archaea genetics, Bacteria genetics, Biological Evolution, Computational Biology, Eukaryotic Cells, Gene Transfer, Horizontal, Invertebrates genetics, Plants genetics, Protein Biosynthesis, Ribosomal Proteins genetics, Vertebrates genetics, Archaea classification, Bacteria classification, Genome, Invertebrates classification, Phylogeny, Plants classification, Vertebrates classification
- Abstract
We have developed an automatable procedure for reconstructing the tree of life with branch lengths comparable across all three domains. The tree has its basis in a concatenation of 31 orthologs occurring in 191 species with sequenced genomes. It revealed interdomain discrepancies in taxonomic classification. Systematic detection and subsequent exclusion of products of horizontal gene transfer increased phylogenetic resolution, allowing us to confirm accepted relationships and resolve disputed and preliminary classifications. For example, we place the phylum Acidobacteria as a sister group of delta-Proteobacteria, support a Gram-positive origin of Bacteria, and suggest a thermophilic last universal common ancestor.
- Published
- 2006
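The abstract of result 7 describes a tree built from a concatenation of 31 orthologs across 191 species. The sketch below shows what concatenating per-gene alignments can look like in its simplest form. The dictionary layout, the function name, and the gap padding for taxa missing from a gene are assumptions, not details taken from the paper.

```python
from typing import Dict, List

Alignment = Dict[str, str]  # taxon name -> aligned sequence for one gene


def concatenate(alignments: List[Alignment], missing: str = "-") -> Alignment:
    """Join per-gene alignments end to end, padding absent taxa with gaps."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this gene contributes a run of gap characters.
            parts[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy alignments, invented for illustration only.
gene1 = {"A": "MKT", "B": "MRT"}
gene2 = {"A": "LL", "C": "LI"}
supermatrix = concatenate([gene1, gene2])
```

Every row of the result has the same length, so the supermatrix can be passed to a tree-building program as a single alignment.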
8. Genome phylogenies indicate a meaningful alpha-proteobacterial phylogeny and support a grouping of the mitochondria with the Rickettsiales.
- Author
- Fitzpatrick DA, Creevey CJ, and McInerney JO
- Subjects
- Base Sequence, Bayes Theorem, Computational Biology, Gene Transfer, Horizontal genetics, Models, Genetic, Sequence Alignment, Species Specificity, Alphaproteobacteria genetics, Evolution, Molecular, Genome, Bacterial genetics, Mitochondria genetics, Phylogeny
- Abstract
Placement of the mitochondrial branch on the tree of life has been problematic. Sparse sampling, the uncertainty of how lateral gene transfer might overwrite phylogenetic signals, and the uncertainty of phylogenetic inference have all contributed to the issue. Here we address this issue using a supertree approach and completed genomic sequences. We first determine that a sensible alpha-proteobacterial phylogenetic tree exists and that it can confidently be inferred using orthologous genes. We show that congruence across these orthologous gene trees is significantly better than might be expected by random chance. There is some evidence of horizontal gene transfer within the alpha-proteobacteria, but it appears to be restricted to a minority of genes (approximately 23%) most of whom (approximately 74%) can be categorized as operational. This means that placement of the mitochondrion should not be excessively hampered by interspecies gene transfer. We then show that there is a consistently strong signal for placement of the mitochondrion on this tree and that this placement is relatively insensitive to methodological approach or data set. A concatenated alignment was created consisting of 15 mitochondrion-encoded proteins that are unlikely to have undergone any lateral gene transfer in the timeline under consideration. This alignment infers that the sister group of the mitochondria, for the taxa that have been sampled, is the order Rickettsiales.
- Published
- 2006
9. The Opisthokonta and the Ecdysozoa may not be clades: stronger support for the grouping of plant and animal than for animal and fungi and stronger support for the Coelomata than Ecdysozoa.
- Author
- Philip GK, Creevey CJ, and McInerney JO
- Subjects
- Animals, Drosophila Proteins genetics, Fungal Proteins genetics, Introns genetics, Plant Proteins genetics, Animal Population Groups genetics, Evolution, Molecular, Fungi genetics, Genome, Phylogeny, Plants genetics, Ribosomal Proteins genetics
- Abstract
In considering the best possible solutions for answering phylogenetic questions from genomic sequences, we have chosen a strategy that we suggest is superior to others that have gone previously. We have ignored multigene families and instead have used single-gene families. This minimizes the inadvertent analysis of paralogs. We have employed strict data controls and have reasoned that if a protein is not capable of recovering the uncontroversial parts of a phylogenetic tree, then why should we use it for the more controversial parts? We have sliced and diced the data in as many ways as possible in order to uncover the signals in that data. Using this strategy, we have tested two controversial hypotheses concerning eukaryotic phylogenetic relationships: the placement of arthropoda and nematodes and the relationships of animals, plants, and fungi. We have constructed phylogenetic trees from 780 single-gene families from 10 completed genomes and amalgamated these into a single supertree. We have also carried out a total evidence analysis on the only universally distributed protein families that can accurately reconstruct the uncontroversial parts of the phylogenetic tree: a total of five families. In doing so, we ignore the majority of single-gene families that are universally distributed as they do not have the appropriate signals to recover the uncontroversial parts of the tree. We have also ignored every protein that has ever been used previously to address this issue, simply because none of them meet our strict criteria. Using these data controls, site stripping, and multiple analyses, 24 out of 26 analyses strongly support the grouping of vertebrates with arthropods (Coelomata hypothesis) and plants with animals. In the other two analyses, the data were ambivalent. The latter finding overturns an 11-year theory of Eukaryotic evolution; the first confirms what has already been said by others. 
In the light of this new tree, we re-analyze the evolution of intron gain and loss in the rpL14 gene and find that it is much more compatible with the hypothesis presented here than with the Opisthokonta hypothesis.
- Published
- 2005
10. Does a tree-like phylogeny only exist at the tips in the prokaryotes?
- Author
- Creevey CJ, Fitzpatrick DA, Philip GK, Kinsella RJ, O'Connell MJ, Pentony MM, Travers SA, Wilkinson M, and McInerney JO
- Subjects
- Likelihood Functions, Models, Genetic, Bacteria genetics, Classification methods, Gene Transfer, Horizontal genetics, Genome, Bacterial, Phylogeny
- Abstract
The extent to which prokaryotic evolution has been influenced by horizontal gene transfer (HGT) and therefore might be more of a network than a tree is unclear. Here we use supertree methods to ask whether a definitive prokaryotic phylogenetic tree exists and whether it can be confidently inferred using orthologous genes. We analysed an 11-taxon dataset spanning the deepest divisions of prokaryotic relationships, a 10-taxon dataset spanning the relatively recent gamma-proteobacteria and a 61-taxon dataset spanning both, using species for which complete genomes are available. Congruence among gene trees spanning deep relationships is not better than random. By contrast, a strong, almost perfect phylogenetic signal exists in gamma-proteobacterial genes. Deep-level prokaryotic relationships are difficult to infer because of signal erosion, systematic bias, hidden paralogy and/or HGT. Our results do not preclude levels of HGT that would be inconsistent with the notion of a prokaryotic phylogeny. This approach will help decide the extent to which we can say that there is a prokaryotic phylogeny and where in the phylogeny a cohesive genomic signal exists.
- Published
- 2004
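The abstract of result 10 asks whether congruence among gene trees is better than random. The sketch below is one simple way to frame that question: count the clades on which two rooted trees disagree, then compare that count with the counts obtained after shuffling the leaf labels of one tree. This is not the supertree-based test used in the paper. The nested-tuple tree encoding, the clade-based distance, the function names, and the requirement that both trees share the same leaf set are all assumptions.

```python
import random
from typing import Dict, FrozenSet, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]  # a leaf name or a tuple of subtrees


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set under every internal node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def clade_distance(a: Tree, b: Tree) -> int:
    """Number of clades present in exactly one of the two trees."""
    return len(clades(a) ^ clades(b))


def relabel(tree: Tree, mapping: Dict[str, str]) -> Tree:
    """Rename leaves according to mapping, keeping the tree shape."""
    if isinstance(tree, str):
        return mapping[tree]
    return tuple(relabel(child, mapping) for child in tree)


def fraction_as_congruent(a: Tree, b: Tree, n_random: int = 1000, seed: int = 0) -> float:
    """Share of label-shuffled copies of b that are at least as close to a as b is."""
    rng = random.Random(seed)
    observed = clade_distance(a, b)
    names = sorted(max(clades(b), key=len))  # the root clade holds every leaf
    hits = 0
    for _ in range(n_random):
        shuffled = names[:]
        rng.shuffle(shuffled)
        randomized = relabel(b, dict(zip(names, shuffled)))
        if clade_distance(a, randomized) <= observed:
            hits += 1
    return hits / n_random


# Toy trees, invented for illustration only.
tree_a: Tree = ((("A", "B"), "C"), ("D", "E"))
tree_b: Tree = ((("A", "C"), "B"), ("D", "E"))
distance = clade_distance(tree_a, tree_b)
share = fraction_as_congruent(tree_a, tree_b, n_random=200)
```

A small share suggests the two trees agree more than shuffled labels would; a share near one suggests they agree no better than chance. Real gene trees usually cover different taxon sets and are unrooted, which this sketch does not handle.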