1. Alignment-free inference of hierarchical and reticulate phylogenomic relationships
- Author
- Xin-Yi Chua, James M. Hogan, Cheong Xin Chan, Guillaume Bernard, Stefan Maetschke, Mark A. Ragan, Yao-ban Chan, and Yingnan Cong
- Subjects
Paper; Genome evolution; Computer science; 0206 medical engineering; Inference; alignment-free; 02 engineering and technology; Computational biology; lateral genetic transfer; TF–IDF; Genome; Evolution, Molecular; 03 medical and health sciences; Reticulate; Phylogenetics; Phylogenomics; Animals; Humans; D2 statistics; Molecular Biology; Phylogeny; 030304 developmental biology; 0303 health sciences; Models, Genetic; Microbiota; k-mer; phylogenomics; Sequence Analysis, DNA; Viruses; Scalability; Sequence Alignment; Algorithms; 020602 bioinformatics; Information Systems
- Abstract
We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed.
- Published
- 2017