
Updating splits, lumps, and shuffles: Reconciling GenBank names with standardized avian taxonomies.

Authors :
Hosner, Peter A.
Zhao, Min
Kimball, Rebecca T.
Braun, Edward L.
Burleigh, J. Gordon
Source :
Ornithology (Oxford University Press). 10/6/2022, Vol. 139 Issue 4, p1-15. 15p.
Publication Year :
2022

Abstract

Biodiversity research has advanced by testing expectations of ecological and evolutionary hypotheses through the linking of large-scale genetic, distributional, and trait datasets. The rise of molecular systematics over the past 30 years has resulted in a wealth of DNA sequences from around the globe. Yet, advances in molecular systematics also have created taxonomic instability, as new estimates of evolutionary relationships and interpretations of species limits have required widespread scientific name changes. Taxonomic instability, colloquially "splits, lumps, and shuffles," presents logistical challenges to large-scale biodiversity research because (1) the same species or sets of populations may be listed under different names in different data sources, or (2) the same name may apply to different sets of populations representing different taxonomic concepts. Consequently, distributional and trait data are often difficult to link directly to primary DNA sequence data without extensive and time-consuming curation. Here, we present RANT: Reconciliation of Avian NCBI Taxonomy. RANT applies taxonomic reconciliation to standardize avian taxon names in use in NCBI GenBank, a primary source of genetic data, to a widely used and regularly updated avian taxonomy: eBird/Clements. Of 14,341 avian species/subspecies names in GenBank, 11,031 directly matched an eBird/Clements name; these link to more than 6 million nucleotide sequences. For the remaining unmatched avian names in GenBank, we used Avibase's system of taxonomic concepts, taxonomic descriptions in Cornell's Birds of the World, and DNA sequence metadata to identify corresponding eBird/Clements names. Reconciled names linked to more than 600,000 nucleotide sequences, ~9% of all avian sequences on GenBank. Nearly 10% of eBird/Clements names had nucleotide sequences listed under 2 or more GenBank names.
Our taxonomic reconciliation is a first step towards rigorous and open-source curation of avian GenBank sequences and is available at GitHub, where it can be updated to correspond to future annual eBird/Clements taxonomic updates. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
2732-4613
Volume :
139
Issue :
4
Database :
Academic Search Index
Journal :
Ornithology (Oxford University Press)
Publication Type :
Academic Journal
Accession number :
159520859
Full Text :
https://doi.org/10.1093/ornithology/ukac045