1. Do pseudogenes pose a problem for metabarcoding marine animal communities?
- Author
- Jessica A. Schultz and Paul D. N. Hebert
- Subjects
Electron Transport Complex IV; Codon, Terminator; Genetics; Animals; DNA, Environmental; DNA, Mitochondrial; Phylogeny; Pseudogenes; Ecology, Evolution, Behavior and Systematics; Biotechnology
- Abstract
Because DNA metabarcoding typically employs sequence diversity among mitochondrial amplicons to estimate species composition, nuclear mitochondrial pseudogenes (NUMTs) can inflate diversity. This study quantifies the incidence and attributes of NUMTs derived from the 658-bp barcode region of cytochrome c oxidase I (COI) in 156 marine animal genomes. NUMTs were examined to ascertain if they could be recognized by their possession of indels or stop codons. In total, 309 NUMTs ≥150 bp were detected, with an average of 1.98 per species (range = 0-33) and a mean length of 391 ± 200 bp. Among this total, 75 (24.3%) lacked indels or stop codons. NUMTs appear to pose the greatest interpretational risk when short (313 bp) amplicons are used, such as in environmental DNA studies, dietary analyses or processed fish identification. Employing the standard amplicon length (313 bp) for marine metabarcoding, NUMTs could potentially inflate the operational taxonomic unit (OTU) count by 21% above the true species count while also raising intraspecific variation at COI by 15%. However, when both amplicon length and position are considered, inflation in OTU counts and in barcode variation were just 9% and 10%, respectively, suggesting NUMTs will not seriously distort biodiversity assessments. There was a weak positive correlation between genome size and NUMT count but no variation among phyla or trophic groups. Until bioinformatic advances improve NUMT detection, the best defence involves targeting long amplicons and developing reference databases that include both mitochondrial sequences and their NUMT derivatives.
- Published
- 2022
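
The abstract describes recognizing NUMTs by their possession of indels or stop codons. The following is a minimal Python sketch of that kind of check, not the authors' pipeline. The function names, the default reading frame, and the use of sequence length as a proxy for frameshift indels are all assumptions made for illustration. The stop-codon set is the one for the invertebrate mitochondrial code (NCBI translation table 5); other lineages would need a different set.

```python
# Illustrative sketch only: not the screening pipeline used in the study.
# Stop codons under the invertebrate mitochondrial code (NCBI translation table 5).
# Other codes differ, e.g. the vertebrate mitochondrial code (table 2) also
# treats AGA and AGG as stops.
STOP_CODONS = {"TAA", "TAG"}


def has_inframe_stop(seq, frame=0, stops=STOP_CODONS):
    """Return True if any complete codon in the given reading frame is a stop."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            return True
    return False


def has_frameshift_indel(seq, expected_length):
    """Return True if the length differs from the expected amplicon length
    by a number of bases that is not a multiple of three."""
    return (len(seq) - expected_length) % 3 != 0


def looks_like_numt(seq, expected_length, frame=0):
    """Flag a sequence that shows either signature (hypothetical helper)."""
    return has_inframe_stop(seq, frame) or has_frameshift_indel(seq, expected_length)


if __name__ == "__main__":
    # Toy 12-bp sequences, not real COI data.
    print(looks_like_numt("ATGGCATTCCGA", expected_length=12))  # no stop, no length shift
    print(looks_like_numt("ATGGCATAACGA", expected_length=12))  # TAA in frame
    print(looks_like_numt("ATGGCATTCCG", expected_length=12))   # one base missing
```

A check of this kind has clear limits. It misses indels whose net length change is a multiple of three, it assumes the reading frame and expected amplicon length are known, and it cannot flag the 24.3% of NUMTs that the abstract reports as lacking both indels and stop codons.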