15 results on '"Sayyari, E."'
Search Results
2. One thousand plant transcriptomes and the phylogenomics of green plants
- Author
-
Leebens-Mack, J.H., Graham, S.W, Wong, G.K-S., DeGironimo, L., Edger, P.P., Jordon-Thaden, I.E., Joya, S., Melkonian, B., Miles, N.W., Pokorny Montero, L., Quigley, C., Thomas, P., Villarreal, J.C., Augustin, M.M., Barrett, M.D., Baucom, R.S., Beerling, D.J., Benstein, R.M., Biffin, E., Brockington, S.F., Burge, D.O., Burris, J.N., Burris, K.P., Burtet-Sarramegna, V., Caicedo, A.L., Cannon, S.B., Çebi, Z., Chang, Y., Chater, C., Cheeseman, J.M., Chen, T., Clarke, N.D., Clayton, H., Covshoff, S., Crandall-Stotler, B.J., Cross, H., Determann, R., Dickson, R.C., Di Stilio, V.S., Ellis, S., Fast, E., Feja, N., Field, K.J., Filatov, D.A., Finnegan, P.M., Floyd, S.K., Fogliani, B., García, N., Gâteblé, G., Godden, G.T., Goh, Q., Greiner, S., Harkess, A., Heaney, Mike J., Helliwell, K.E., Heyduk, K., Hibberd, J.M., Hodel, R.G.J., Hollingsworth, P.M., Johnson, M.T.J., Jost, R., Joyce, B., Kapralov, M.V., Kazamia, E., Kellogg, E.A., Koch, M.A., Von Konrat, M., Könyves, K., Kutchan, T.M., Lam, V., Larsson, A., Leitch, A.R., Lentz, R., Li, F.-W., Lowe, A.J., Ludwig, M., Manos, P.S., Mavrodiev, E., McCormick, M.K., McKain, M, McLellan, T., McNeal, J., Miller, R., Nelson, M.N., Peng, Y., Ralph, P., Real, D., Riggins, C.W., Ruhsam, M., Sage, R.F., Sakai, A.K., Scascitella, M., Schilling, E.E., Schlösser, E., Sederoff, H., Servick, S., Shaw, A.J., Shaw, S.W., Sigel, E.M., Skema, C., Smith, A.G., Smithson, A., NeilStewart, C., Stinchcombe, J.R., Szövényi, P., Tate, J.A., Tiebel, H., Trapnell, D., Villegente, M., Wang, C., Weller, S.G., Wenzel, M., Weststrand, S., Westwood, J.H., Whigham, D.F., Wulff, A.S., Yang, Y., Zhu, D., Zhuang, C., Zuidof, J., Chase, M.W., Deyholos, M.K., Graham, S.W., Pires, J. Chris, Rothfels, C.J., Chen, C., Chen, L., Cheng, S., Li, J., Li, R., Li, X., Lu, H., Ou, Y., Tan, X., Tang, J., Tian, Z., Wang, F., Wang, J., Wei, X., Wong, G. K.-S., Xu, X., Yan, Z., Yang, F., Zhong, X., Zhou, F., Zhu, Y., Zhang, Y., Yu, J., Barkman, T. J., Carpenter, E. J., Liu, T., Sun, X., Wu, S., Mirarab, S., Nguyen, N., Gitzendanner, M. A., Ayyampalayam, S., Der, J., Matasci, N., Sayyari, E., Soltis, D. E., Soltis, P. S., Stevenson, D. W., Wafula, E. K., Walls, R., Wickett, N. J., De Pamphilis, C. W., Graham, S. W, Leebens-Mack, J. H., Warnow, T., Li, Z., An, H., Arrigo, N., Baniaga, A. E., Galuska, S., Jorgensen, S. A., Kidder, T. I., Kong, H., Lu-Irving, P., Marx, H. E., Qi, X., Reardon, C. R., Sessa, E. B., Sutherland, B. L., Tiley, G. P., Welles, S. R., Yu, R., Zhan, S., Barker, M. S., Porsch, M., Ullrich, K. K., Gramzow, L., Melkonian, M., Nelson, D. R., Theißen, G., Wong, G. K. S., Grosse, I., Rensing, S. A., Quint, M., Institut de sciences exactes et appliquées (ISEA), Université de la Nouvelle-Calédonie (UNC), Apollo - University of Cambridge Repository, National Key Research and Development Program (China), Ministry of Science and Technology of the People's Republic of China, Filatov, D, One Thousand Plant Transcriptomes Initiative, and School of Plant and Environmental Sciences
- Subjects
- 0106 biological sciences, Genome evolution, Nuclear gene, 631/208/212/2306, [SDV]Life Sciences [q-bio], Viridiplantae, 01 natural sciences, Genome, Article, Evolution, Molecular, 03 medical and health sciences, Plant evolution, Phylogenomics, Databases, Genetic, Glaucophyta, 631/449/2669, 631/181/735, Plastid, Phylogeny, 45/90, 030304 developmental biology, 45/91, Adaptive radiation, 0303 health sciences, 631/181/759/2467, Multidisciplinary, biology, Archaeplastida, fungi, Botany, food and beverages, Botanik, 15. Life on land, biology.organism_classification, Biological Evolution, Evolutionary biology, Molecular evolution, Transcriptome, Genome, Plant, 010606 plant biology & botany
- Abstract
Green plants (Viridiplantae) include around 450,000–500,000 species of great diversity and have important roles in terrestrial and aquatic ecosystems. Here, as part of the One Thousand Plant Transcriptomes Initiative, we sequenced the vegetative transcriptomes of 1,124 species that span the diversity of plants in a broad sense (Archaeplastida), including green plants (Viridiplantae), glaucophytes (Glaucophyta) and red algae (Rhodophyta). Our analysis provides a robust phylogenomic framework for examining the evolution of green plants. Most inferred species relationships are well supported across multiple species tree and supermatrix analyses, but discordance among plastid and nuclear gene trees at a few important nodes highlights the complexity of plant genome evolution, including polyploidy, periods of rapid speciation, and extinction. Incomplete sorting of ancestral variation, polyploidization and massive expansions of gene families punctuate the evolutionary history of green plants. Notably, we find that large expansions of gene families preceded the origins of green plants, land plants and vascular plants, whereas whole-genome duplications are inferred to have occurred repeatedly throughout the evolution of flowering plants and ferns. The increasing availability of high-quality plant genome sequences and advances in functional genomics are enabling research on genome evolution across the green tree of life. The One Thousand Plant Transcriptomes Initiative provides a robust phylogenomic framework for examining green plant evolution that comprises the transcriptomes and genomes of diverse species of green plants.
- Published
- 2019
3. Development and extensive sequencing of a broadly-consented Genome in a Bottle matched tumor-normal pair.
- Author
-
McDaniel JH, Patel V, Olson ND, He HJ, He Z, Cole KD, Schmitt A, Sikkink K, Sedlazeck FJ, Doddapaneni H, Jhangiani SN, Muzny DM, Gingras MC, Mehta H, Paulin LF, Hastie AR, Yu HC, Weigman V, Rojas A, Kennedy K, Remington J, Gonzalez I, Sudkamp M, Wiseman K, Lajoie BR, Levy S, Jain M, Akeson S, Narzisi G, Steinsnyder Z, Reeves C, Shelton J, Kingan SB, Lambert C, Bayabyan P, Wenger AM, McLaughlin IJ, Adamson A, Kingsley C, Wescott M, Kim Y, Paten B, Park J, Violich I, Miga KH, Gardner J, McNulty B, Rosen G, McCoy R, Brundu F, Sayyari E, Scheffler K, Truong S, Catreux S, Hannah LC, Lipson D, Benjamin H, Iremadze N, Soifer I, Eacker S, Wood M, Cross E, Husar G, Gross S, Vernich M, Kolmogorov M, Ahmad T, Keskus A, Bryant A, Thibaud-Nissen F, Trow J, Proszynski J, Hirschberg JW, Ryon K, Mason CE, Wagner J, Xiao C, Liss AS, and Zook JM
- Abstract
The Genome in a Bottle Consortium (GIAB), hosted by the National Institute of Standards and Technology (NIST), is developing new matched tumor-normal samples, the first to be explicitly consented for public dissemination of genomic data and cell lines. Here, we describe a comprehensive genomic dataset from the first individual, HG008, including DNA from an adherent, epithelial-like pancreatic ductal adenocarcinoma (PDAC) tumor cell line and matched normal cells from duodenal and pancreatic tissues. Data for the tumor-normal matched samples come from thirteen distinct state-of-the-art whole genome measurement technologies, including high depth short and long-read bulk whole genome sequencing (WGS), single cell WGS, Hi-C, and karyotyping. These data will be used by the GIAB Consortium to develop matched tumor-normal benchmarks for somatic variant detection. We expect these data to facilitate innovation for whole genome measurement technologies, de novo assembly of tumor and normal genomes, and bioinformatic tools to identify small and structural somatic mutations. This first-of-its-kind broadly consented open-access resource will facilitate further understanding of sequencing methods used for cancer biology. Competing Interests: A.S. and K.S. are employees of Arima Genomics. L.F.P. from BCM, was sponsored by Genentech Inc until September 2023. F.J.S from BCM, received research support from Illumina, ONT and Pacbio. A.R.H and H-C.Y. are employees of Bionano Genomics and own stock shares and options of Bionano Genomics, Inc. V.W., K.K., J.R., and I.G. are employees of BioSkryb Genomics. M.S., K.B., B.R.L. and S.L. are employees of Element Biosciences. S.B.K., C.L., P.B., A.M.W., I.J.M., A.A., C.K., M.W., and Y.K. are employees and shareholders of PacBio, Inc. D.L., H.B., N.I., and I.S. are employees and shareholders of Ultima Genomics. S.E. and M.W. are employees of Phase Genomics. E.C., G.H., S.G., and M.V. are employees of KromaTiD, Inc; E.C. is also a shareholder. F.B., E.S., K.S., S.T. and S.C. are employees of Illumina, Inc. All other authors have no competing interests.
- Published
- 2024
4. EMPress Enables Tree-Guided, Interactive, and Exploratory Analyses of Multi-omic Data Sets.
- Author
-
Cantrell K, Fedarko MW, Rahman G, McDonald D, Yang Y, Zaw T, Gonzalez A, Janssen S, Estaki M, Haiminen N, Beck KL, Zhu Q, Sayyari E, Morton JT, Armstrong G, Tripathi A, Gauglitz JM, Marotz C, Matteson NL, Martino C, Sanders JG, Carrieri AP, Song SJ, Swafford AD, Dorrestein PC, Andersen KG, Parida L, Kim HC, Vázquez-Baeza Y, and Knight R
- Abstract
Standard workflows for analyzing microbiomes often include the creation and curation of phylogenetic trees. Here we present EMPress, an interactive web tool for visualizing trees in the context of microbiome, metabolome, and other community data scalable to trees with well over 500,000 nodes. EMPress provides novel functionality (including ordination integration and animations) alongside many standard tree visualization features and thus simplifies exploratory analyses of many forms of 'omic data. Importance: Phylogenetic trees are integral data structures for the analysis of microbial communities. Recent work has also shown the utility of trees constructed from certain metabolomic data sets, further highlighting their importance in microbiome research. The ever-growing scale of modern microbiome surveys has led to numerous challenges in visualizing these data. In this paper we used five diverse data sets to showcase the versatility and scalability of EMPress, an interactive web visualization tool. EMPress addresses the growing need for exploratory analysis tools that can accommodate large, complex multi-omic data sets. (Copyright © 2021 Cantrell et al.)
- Published
- 2021
5. More is needed - Thousands of loci are required to elucidate the relationships of the 'flowers of the sea' (Sabellida, Annelida).
- Author
-
Tilic E, Sayyari E, Stiller J, Mirarab S, and Rouse GW
- Subjects
- Animals, Annelida classification, Data Analysis, Likelihood Functions, Species Specificity, Transcriptome genetics, Annelida genetics, Genetic Loci, Phylogeny
- Abstract
Sabellida is a well-known clade containing tube-dwelling annelid worms with a radiolar crown. Iterative phylogenetic analyses over three decades have resulted in three main clades being recognized: Fabriciidae, Serpulidae and Sabellidae, with Fabriciidae proposed as the sister group to Serpulidae. However, relationships within Sabellidae have remained poorly understood, with a proliferation of genera. In order to obtain a robust phylogeny with optimal support, we conducted a large-scale phylogenomic analysis with 19 new sabellid transcriptomes for a total of 21 species. In contrast to earlier findings based on limited DNA data, our results support the position of Fabriciidae as sister taxon to a Sabellidae + Serpulidae clade. Our large sampling within Sabellidae also allows us to establish a stable phylogeny within this clade. We restrict Sabellinae to a subclade of Sabellidae and broaden the previously monotypic Myxicolinae to include Amphicorina and Chone. We tested the robustness of species tree reconstruction by subsampling increasing numbers of genes to uncover hidden support of alternative topologies. Our results show that inclusion of more genes leads to a more stable topology with higher support, and also that including higher divergence genes leads to stronger resolution. (Copyright © 2020 Elsevier Inc. All rights reserved.)
- Published
- 2020
6. Phylogenomics of 10,575 genomes reveals evolutionary proximity between domains Bacteria and Archaea.
- Author
-
Zhu Q, Mai U, Pfeiffer W, Janssen S, Asnicar F, Sanders JG, Belda-Ferre P, Al-Ghalith GA, Kopylova E, McDonald D, Kosciolek T, Yin JB, Huang S, Salam N, Jiao JY, Wu Z, Xu ZZ, Cantrell K, Yang Y, Sayyari E, Rabiee M, Morton JT, Podell S, Knights D, Li WJ, Huttenhower C, Segata N, Smarr L, Mirarab S, and Knight R
- Subjects
- Archaea genetics, Bacteria genetics, Archaea classification, Bacteria classification, Evolution, Molecular, Genome, Archaeal, Genome, Bacterial, Phylogeny
- Abstract
Rapid growth of genome data provides opportunities for updating microbial evolutionary relationships, but this is challenged by the discordant evolution of individual genes. Here we build a reference phylogeny of 10,575 evenly-sampled bacterial and archaeal genomes, based on a comprehensive set of 381 markers, using multiple strategies. Our trees indicate remarkably closer evolutionary proximity between Archaea and Bacteria than previous estimates that were limited to fewer "core" genes, such as the ribosomal proteins. The robustness of the results was tested with respect to several variables, including taxon and site sampling, amino acid substitution heterogeneity and saturation, non-vertical evolution, and the impact of exclusion of candidate phyla radiation (CPR) taxa. Our results provide an updated view of domain-level relationships.
- Published
- 2019
7. TADA: phylogenetic augmentation of microbiome samples enhances phenotype classification.
- Author
-
Sayyari E, Kawas B, and Mirarab S
- Subjects
- Machine Learning, Phenotype, Algorithms, Microbiota, Phylogeny
- Abstract
Motivation: Learning associations of traits with the microbial composition of a set of samples is a fundamental goal in microbiome studies. Recently, machine learning methods have been explored for this goal, with some promise. However, in comparison to other fields, microbiome data are high-dimensional and not abundant, leading to a high-dimensional low-sample-size under-determined system. Moreover, microbiome data are often unbalanced and biased. Given such training data, machine learning methods often fail to perform a classification task with sufficient accuracy. Lack of signal is especially problematic when classes are represented in an unbalanced way in the training data, with some classes under-represented. The presence of inter-correlations among subsets of observations further compounds these issues. As a result, machine learning methods have had only limited success in predicting many traits from microbiome. Data augmentation consists of building synthetic samples and adding them to the training data and is a technique that has proved helpful for many machine learning tasks. Results: In this paper, we propose a new data augmentation technique for classifying phenotypes based on the microbiome. Our algorithm, called TADA, uses available data and a statistical generative model to create new samples augmenting existing ones, addressing issues of low-sample-size. In generating new samples, TADA takes into account phylogenetic relationships between microbial species. On two real datasets, we show that adding these synthetic samples to the training set improves the accuracy of downstream classification, especially when the training data have an unbalanced representation of classes. Availability and Implementation: TADA is available at https://github.com/tada-alg/TADA. Supplementary Information: Supplementary data are available at Bioinformatics online. (© The Author(s) 2019. Published by Oxford University Press.)
- Published
- 2019
8. Multi-allele species reconstruction using ASTRAL.
- Author
-
Rabiee M, Sayyari E, and Mirarab S
- Subjects
- Algorithms, Computer Simulation, Databases, Genetic, Species Specificity, Alleles, Genomics methods, Phylogeny
- Abstract
Genome-wide phylogeny reconstruction is becoming increasingly common, and one driving factor behind these phylogenomic studies is the promise that the potential discordance between gene trees and the species tree can be modeled. Incomplete lineage sorting is one cause of discordance that bridges population genetic and phylogenetic processes. ASTRAL is a species tree reconstruction method that seeks to find the tree with minimum quartet distance to an input set of inferred gene trees. However, the published ASTRAL algorithm only works with one sample per species. To account for polymorphisms in present-day species, one can sample multiple individuals per species to create multi-allele datasets. Here, we introduce how ASTRAL can handle multi-allele datasets. We show that the quartet-based optimization problem extends naturally, and we introduce heuristic methods for building the search space specifically for the case of multi-individual datasets. We study the accuracy and scalability of the multi-individual version of ASTRAL-III using extensive simulation studies and compare it to NJst, the only other scalable method that can handle these datasets. We do not find strong evidence that using multiple individuals dramatically improves accuracy. When we study the trade-off between sampling more genes versus more individuals, we find that sampling more genes is more effective than sampling more individuals, even under conditions that we study where trees are shallow (median length: ≈1 N_e) and ILS is extremely high. (Published by Elsevier Inc.)
- Published
- 2019
9. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees.
- Author
-
Zhang C, Rabiee M, Sayyari E, and Mirarab S
- Subjects
- Animals, Birds classification, Birds genetics, Computer Simulation, Databases, Genetic, Models, Genetic, Species Specificity, Time Factors, Algorithms, Phylogeny
- Abstract
Background: Evolutionary histories can be discordant across the genome, and such discordances need to be considered in reconstructing the species phylogeny. ASTRAL is one of the leading methods for inferring species trees from gene trees while accounting for gene tree discordance. ASTRAL uses dynamic programming to search for the tree that shares the maximum number of quartet topologies with input gene trees, restricting itself to a predefined set of bipartitions. Results: We introduce ASTRAL-III, which substantially improves the running time of ASTRAL-II and guarantees polynomial running time as a function of both the number of species (n) and the number of genes (k). ASTRAL-III limits the bipartition constraint set (X) to grow at most linearly with n and k. Moreover, it handles polytomies more efficiently than ASTRAL-II, exploits similarities between gene trees better, and uses several techniques to avoid searching parts of the search space that are mathematically guaranteed not to include the optimal tree. The asymptotic running time of ASTRAL-III in the presence of polytomies is [Formula: see text] where D=O(nk) is the sum of degrees of all unique nodes in input trees. The running time improvements enable us to test whether contracting low support branches in gene trees improves the accuracy by reducing noise. In extensive simulations, we show that removing branches with very low support (e.g., below 10%) improves accuracy while overly aggressive filtering is harmful. We observe on a biological avian phylogenomic dataset of 14K genes that contracting low support branches greatly improves results. Conclusions: ASTRAL-III is a faster version of the ASTRAL method for phylogenetic reconstruction and can scale up to 10,000 species. With ASTRAL-III, low support branches can be removed, resulting in improved accuracy.
- Published
- 2018
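Note on record 9: the quartet criterion named in the abstract (a species tree that shares the maximum number of quartet topologies with the input gene trees) can be illustrated with a brute-force count. The sketch below is only a toy under stated assumptions: trees are encoded as nested tuples of leaf names, all trees share the same leaves, and the function names are hypothetical. It enumerates every quartet directly and is not the dynamic programming algorithm that ASTRAL uses.

```python
from itertools import combinations


def leaf_set(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """Collect the leaf set below every internal node (one side of a bipartition)."""
    if isinstance(tree, str):
        return []
    found = [leaf_set(tree)]
    for child in tree:
        found.extend(clades(child))
    return found


def quartet_topology(tree_clades, quartet):
    """Return the split induced on four leaves as a set of two pairs, or None if unresolved."""
    q = frozenset(quartet)
    for clade in tree_clades:
        inside = q & clade
        if len(inside) == 2:
            # Some branch separates two of the leaves from the other two.
            return frozenset([inside, q - inside])
    return None


def quartet_score(species_tree, gene_trees):
    """Count quartet topologies shared between the species tree and the gene trees."""
    species_clades = clades(species_tree)
    gene_clades = [clades(g) for g in gene_trees]
    score = 0
    for quartet in combinations(sorted(leaf_set(species_tree)), 4):
        topology = quartet_topology(species_clades, quartet)
        if topology is None:
            continue  # unresolved in the species tree, nothing to share
        score += sum(1 for gc in gene_clades if quartet_topology(gc, quartet) == topology)
    return score


# Hypothetical five-taxon example: one concordant and one discordant gene tree.
species = ((("A", "B"), "C"), ("D", "E"))
genes = [((("A", "B"), "C"), ("D", "E")), ((("A", "C"), "B"), ("D", "E"))]
print(quartet_score(species, genes))  # prints 8 (5 shared quartets + 3 shared quartets)
```

Enumerating quartets costs time on the order of n^4 per tree, which is why this form is only useful for small examples.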
10. DiscoVista: Interpretable visualizations of gene tree discordance.
- Author
-
Sayyari E, Whitfield JB, and Mirarab S
- Subjects
- Genome, Models, Genetic, Phylogeny, Classification methods, Software
- Abstract
Phylogenomics has ushered in an age of discordance. Analyses often reveal abundant discordances among phylogenies of different parts of genomes, as well as incongruences between species trees obtained using different methods or data partitions. Researchers are often left trying to make sense of such incongruences. Interpretive ways of measuring and visualizing discordance are needed, both among alternative species trees and gene trees, especially for specific focal branches of a tree. Here, we introduce DiscoVista, a publicly available tool that creates a suite of simple but interpretable visualizations. DiscoVista helps quantify the amount of discordance and some of its potential causes. (Published by Elsevier Inc.)
- Published
- 2018
11. Testing for Polytomies in Phylogenetic Species Trees Using Quartet Frequencies.
- Author
-
Sayyari E and Mirarab S
- Abstract
Phylogenetic species trees typically represent the speciation history as a bifurcating tree. Speciation events that simultaneously create more than two descendants, thereby creating polytomies in the phylogeny, are possible. Moreover, the inability to resolve relationships is often shown as a (soft) polytomy. Both types of polytomies have been traditionally studied in the context of gene tree reconstruction from sequence data. However, polytomies in the species tree cannot be detected or ruled out without considering gene tree discordance. In this paper, we describe a statistical test based on properties of the multi-species coalescent model to test the null hypothesis that a branch in an estimated species tree should be replaced by a polytomy. On both simulated and biological datasets, we show that the null hypothesis is rejected for all but the shortest branches, and in most cases, it is retained for true polytomies. The test, available as part of the Accurate Species TRee ALgorithm (ASTRAL) package, can help systematists decide whether their datasets are sufficient to resolve specific relationships of interest.
- Published
- 2018
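Note on record 11: under the multi-species coalescent, a branch of length zero (a true polytomy) makes the three quartet topologies around that branch equally frequent in expectation. One simple way to illustrate a test of that null hypothesis is a chi-square goodness-of-fit test on three quartet counts. The sketch below assumes this form of the test, and the function name and counts are hypothetical; the implementation shipped with ASTRAL may differ in its details.

```python
import math


def polytomy_p_value(counts):
    """Chi-square goodness-of-fit p-value for three quartet counts against equal frequencies."""
    if len(counts) != 3:
        raise ValueError("expected counts for the three quartet topologies")
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one gene tree quartet")
    expected = total / 3.0
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return math.exp(-statistic / 2.0)


# Hypothetical counts: balanced frequencies keep the polytomy null,
# a dominant topology rejects it.
print(polytomy_p_value([100, 100, 100]))
print(polytomy_p_value([200, 50, 50]))
```

A small p-value rejects the polytomy, meaning the gene trees support a resolved branch; a large p-value means the data cannot rule the polytomy out.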
12. Fragmentary Gene Sequences Negatively Impact Gene Tree and Species Tree Reconstruction.
- Author
-
Sayyari E, Whitfield JB, and Mirarab S
- Subjects
- Algorithms, Animals, Computer Simulation, Genetic Speciation, Genome, Insecta genetics, Models, Genetic, Peptide Fragments genetics, Genomics methods, Phylogeny, Sequence Analysis, Protein methods
- Abstract
Species tree reconstruction from genome-wide data is increasingly being attempted, in most cases using a two-step approach of first estimating individual gene trees and then summarizing them to obtain a species tree. The accuracy of this approach, which promises to account for gene tree discordance, depends on the quality of the inferred gene trees. At the same time, phylogenomic and phylotranscriptomic analyses typically use involved bioinformatics pipelines for data preparation. Errors and shortcomings resulting from these preprocessing steps may impact the species tree analyses at the other end of the pipeline. In this article, we first show that the presence of fragmentary data for some species in a gene alignment, as often seen on real data, can result in substantial deterioration of gene trees, and as a result, the species tree. We then investigate a simple filtering strategy where individual fragmentary sequences are removed from individual genes but the rest of the gene is retained. Both in simulations and by reanalyzing a large insect phylotranscriptomic data set, we show the effectiveness of this simple filtering strategy. (© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2017
13. Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction.
- Author
-
Mai U, Sayyari E, and Mirarab S
- Subjects
- Computer Simulation, Databases as Topic, Genes, Species Specificity, Algorithms, Phylogeny
- Abstract
Phylogenetic trees inferred using commonly-used models of sequence evolution are unrooted, but the root position matters both for interpretation and downstream applications. This issue has been long recognized; however, whether the potential for discordance between the species tree and gene trees impacts methods of rooting a phylogenetic tree has not been extensively studied. In this paper, we introduce a new method of rooting a tree based on its branch length distribution; our method, which minimizes the variance of root to tip distances, is inspired by the traditional midpoint rerooting and is justified when deviations from the strict molecular clock are random. Like midpoint rerooting, the method can be implemented in a linear time algorithm. In extensive simulations that consider discordance between gene trees and the species tree, we show that the new method is more accurate than midpoint rerooting, but its relative accuracy compared to using outgroups to root gene trees depends on the size of the dataset and levels of deviations from the strict clock. We show high levels of error for all methods of rooting estimated gene trees due to factors that include effects of gene tree discordance, deviations from the clock, and gene tree estimation error. Our simulations, however, did not reveal significant differences between two equivalent methods for species tree estimation that use rooted and unrooted input, namely, STAR and NJst. Nevertheless, our results point to limitations of existing scalable rooting methods.
- Published
- 2017
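Note on record 13: the rooting criterion in the abstract (choose the root that minimizes the variance of root-to-tip distances) can be sketched by trying every branch in turn. The code below is a quadratic-time illustration, not the linear-time algorithm the abstract mentions. The adjacency-dictionary encoding, the function names and the example tree are all assumptions made for this sketch. On each branch, the distance from the root to leaf i is c[i] + s[i] * x for an offset x along the branch, so the variance is quadratic in x and its minimizer has a closed form.

```python
def leaf_distances(adj, start, blocked):
    """Distances from `start` to each leaf reachable without crossing to `blocked`."""
    found = []
    stack = [(start, blocked, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        children = [n for n in adj[node] if n != parent]
        if not children:  # degree-1 node reached: this is a leaf
            found.append(dist)
        for child in children:
            stack.append((child, node, dist + adj[node][child]))
    return found


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def min_variance_root(adj):
    """Return (variance, u, v, x): the root sits x along edge u-v, measured from u."""
    best = None
    visited = set()
    for u in adj:
        for v, length in adj[u].items():
            if (v, u) in visited:
                continue  # each undirected edge is examined once
            visited.add((u, v))
            near = leaf_distances(adj, u, v)
            far = [d + length for d in leaf_distances(adj, v, u)]
            # Root-to-tip distance of leaf i at offset x is c[i] + s[i] * x.
            c = near + far
            s = [1.0] * len(near) + [-1.0] * len(far)
            mc, ms = mean(c), mean(s)
            cov = mean([(ci - mc) * (si - ms) for ci, si in zip(c, s)])
            var_s = mean([(si - ms) ** 2 for si in s])
            # The variance is quadratic in x; clamp its minimizer to the edge.
            x = min(max(-cov / var_s, 0.0), length)
            dists = [ci + si * x for ci, si in zip(c, s)]
            md = mean(dists)
            var = mean([(d - md) ** 2 for d in dists])
            if best is None or var < best[0]:
                best = (var, u, v, x)
    return best


# Hypothetical unrooted tree: leaves A, B, C joined at internal node X.
tree = {
    "A": {"X": 1.0},
    "B": {"X": 1.0},
    "C": {"X": 3.0},
    "X": {"A": 1.0, "B": 1.0, "C": 3.0},
}
print(min_variance_root(tree))
```

In this example the root falls on the branch between C and X, one unit away from X, where all three leaves sit at distance 2 from the root and the variance is zero.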
14. Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction.
- Author
-
Sayyari E and Mirarab S
- Subjects
- Animals, Birds classification, Birds genetics, Databases, Genetic, Mammals classification, Mammals genetics, Models, Genetic, Phylogeny, Algorithms
- Abstract
Background: Inferring species trees from gene trees using the coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. Results: We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves. Conclusions: We show in simulated studies that DISTIQUE has comparable accuracy to leading coalescent-based summary methods and reduced running times.
- Published
- 2016
15. Fast Coalescent-Based Computation of Local Branch Support from Quartet Frequencies.
- Author
-
Sayyari E and Mirarab S
- Subjects
- Algorithms, Bayes Theorem, Computer Simulation, Genetic Speciation, Phylogeny, Probability, Computational Biology methods, Genomics methods, Models, Genetic
- Abstract
Species tree reconstruction is complicated by effects of incomplete lineage sorting, commonly modeled by the multi-species coalescent model (MSC). While there has been substantial progress in developing methods that estimate a species tree given a collection of gene trees, less attention has been paid to fast and accurate methods of quantifying support. In this article, we propose a fast algorithm to compute quartet-based support for each branch of a given species tree with regard to a given set of gene trees. We then show how the quartet support can be used in the context of the MSC to compute (1) the local posterior probability (PP) that the branch is in the species tree and (2) the length of the branch in coalescent units. We evaluate the precision and recall of the local PP on a wide set of simulated and biological datasets, and show that it has very high precision and improved recall compared with multi-locus bootstrapping. The estimated branch lengths are highly accurate when gene tree estimation error is low, but are underestimated when gene tree estimation error increases. Computation of both the branch length and local PP is implemented as new features in ASTRAL. (© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2016
Discovery Service for Jio Institute Digital Library