Measuring the impact of gene prediction on gene loss estimates in Eukaryotes by quantifying falsely inferred absences.

Authors :
Deutekom ES
Vosseberg J
van Dam TJP
Snel B
Source :
PLoS computational biology [PLoS Comput Biol] 2019 Aug 28; Vol. 15 (8), pp. e1007301. Date of Electronic Publication: 2019 Aug 28 (Print Publication: 2019).
Publication Year :
2019

Abstract

In recent years it became clear that in eukaryotic genome evolution gene loss is prevalent over gene gain. However, the absence of genes in an annotated genome is not always equivalent to the loss of genes. Due to sequencing issues, or incorrect gene prediction, genes can be falsely inferred as absent. This implies that loss estimates are overestimated and, more generally, that falsely inferred absences impact genomic comparative studies. However, reliable estimates of how prevalent this issue is are lacking. Here we quantified the impact of gene prediction on gene loss estimates in eukaryotes by analysing 209 phylogenetically diverse eukaryotic organisms and comparing their predicted proteomes to that of their respective six-frame translated genomes. We observe that 4.61% of domains per species were falsely inferred to be absent for Pfam domains predicted to have been present in the last eukaryotic common ancestor. Between phylogenetically different categories this estimate varies substantially: for clade-specific loss (ancestral loss) we found 1.30% and for species-specific loss 16.88% to be falsely inferred as absent. For BUSCO 1-to-1 orthologous families, 18.30% were falsely inferred to be absent. Finally, we showed that falsely inferred absences indeed impact loss estimates, with the number of losses decreasing by 11.78%. Our work strengthens the increasing number of studies showing that gene loss is an important factor in eukaryotic genome evolution. However, while we demonstrate that on average inferring gene absences from predicted proteomes is reliable, caution is warranted when inferring species-specific absences.

Competing Interests: The authors have declared that no competing interests exist.
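
Illustrative sketch (not taken from the paper): the comparison described in the abstract can be pictured as simple set arithmetic for a single species. A domain counts as falsely inferred absent when it is missing from the predicted proteome but is still detected in the six-frame translated genome. The function name, the toy domain labels, and the choice of denominator (all domains absent from the predicted proteome) are assumptions made for this example; the abstract does not specify how the authors computed their percentages.

```python
def falsely_inferred_absences(reference_domains, proteome_hits, six_frame_hits):
    """Return (false_absences, percent) for one species.

    reference_domains: domains expected to be present (e.g. an ancestral set)
    proteome_hits:     domains detected in the predicted proteome
    six_frame_hits:    domains detected in the six-frame translated genome

    Assumption: the percentage is taken over the domains that are absent
    from the predicted proteome.
    """
    reference = set(reference_domains)
    # Domains that look absent when only the predicted proteome is searched.
    absent = reference - set(proteome_hits)
    # Of those, the ones that the six-frame translation still recovers.
    false_absent = absent & set(six_frame_hits)
    percent = 100.0 * len(false_absent) / len(absent) if absent else 0.0
    return false_absent, percent


# Toy data with placeholder labels (not real Pfam accessions).
reference = {"dom_A", "dom_B", "dom_C", "dom_D", "dom_E"}
proteome = {"dom_A", "dom_B"}
six_frame = {"dom_A", "dom_B", "dom_D"}

false_absent, percent = falsely_inferred_absences(reference, proteome, six_frame)
print(sorted(false_absent), round(percent, 2))
```

In this toy case three domains appear absent from the predicted proteome and one of them is recovered from the six-frame translation, so one third of the inferred absences would be false.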

Details

Language :
English
ISSN :
1553-7358
Volume :
15
Issue :
8
Database :
MEDLINE
Journal :
PLoS computational biology
Publication Type :
Academic Journal
Accession number :
31461468
Full Text :
https://doi.org/10.1371/journal.pcbi.1007301