Back to Search Start Over

Conflict over the Eukaryote Root Resides in Strong Outliers, Mosaics and Missing Data Sensitivity of Site-Specific (CAT) Mixture Models.

Authors :
Al Jewari C
Baldauf SL
Source :
Systematic biology [Syst Biol] 2023 May 19; Vol. 72 (1), pp. 1-16.
Publication Year :
2023

Abstract

Phylogenetic reconstruction using concatenated loci ("phylogenomics" or "supermatrix phylogeny") is a powerful tool for solving evolutionary splits that are poorly resolved in single gene/protein trees. However, recent phylogenomic attempts to resolve the eukaryote root have yielded conflicting results, along with claims of various artifacts hidden in the data. We have investigated these conflicts using two new methods for assessing phylogenetic conflict. ConJak uses whole marker (gene or protein) jackknifing to assess deviation from a central mean for each individual sequence, whereas ConWin uses a sliding window to screen for incongruent protein fragments (mosaics). Both methods allow selective masking of individual sequences or sequence fragments in order to minimize missing data, an important consideration for resolving deep splits with limited data. Analyses focused on a set of 76 eukaryotic proteins of bacterial ancestry previously used in various combinations to assess the branching order among the three major divisions of eukaryotes: Amorphea (mainly animals, fungi, and Amoebozoa), Diaphoretickes (most other well-known eukaryotes and nearly all algae) and Excavata, represented here by Discoba (Jakobida, Heterolobosea, and Euglenozoa). ConJak analyses found strong outliers to be concentrated in undersampled lineages, whereas ConWin analyses of Discoba, the most undersampled of the major lineages, detected potentially incongruent fragments scattered throughout. Phylogenetic analyses of the full data using an LG-gamma model support a Discoba sister scenario (neozoan-excavate root), which rises to 99-100% bootstrap support with data masked according to either protocol. However, analyses with two site-specific (CAT) mixture models yielded widely inconsistent results and a striking sensitivity to missing data. The neozoan-excavate root places Amorphea and Diaphoretickes as more closely related to each other than either is to Discoba, a fundamental relationship that should remain unaffected by additional taxa. [CAT-GTR; Discoba; eukaryote tree of life; HGT; jackknife; mixture models; mosaic genes; phylogenomics; sliding window; supermatrix.].<br /> (© The Author(s) 2022. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.)

Details

Language :
English
ISSN :
1076-836X
Volume :
72
Issue :
1
Database :
MEDLINE
Journal :
Systematic biology
Publication Type :
Academic Journal
Accession number :
35412616
Full Text :
https://doi.org/10.1093/sysbio/syac029