1. Application of DArT seq derived SNP tags for comparative genome analysis in fishes; An alternative pipeline using sequence data from a non-traditional model species, Macquaria ambigua
- Author
Richard P. Duncan, Andrzej Kilian, Ross M. Thompson, Tariq Ezaz, Foyez Shams, Fiona Dyer, and Jason D. Thiem
- Subjects
Evolutionary Genetics; 0106 biological sciences; Retrotransposon; Genome browser; 01 natural sciences; Genome; Database and Informatics Methods; Genome Evolution; Phylogeny; Comparative Genomic Hybridization; 0303 health sciences; Multidisciplinary; Fishes; Eukaryota; Genomics; Osteichthyes; Vertebrates; Medicine; Sequence Analysis; Research Article; Multiple Alignment Calculation; Genome evolution; Bioinformatics; Sequence analysis; Science; Sequence alignment; Computational biology; Biology; Research and Analysis Methods; Polymorphism, Single Nucleotide; 010603 evolutionary biology; Molecular Evolution; 03 medical and health sciences; Computational Techniques; Genetics; Animals; DNA sequence analysis; 030304 developmental biology; Comparative genomics; Evolutionary Biology; Sticklebacks; Organisms; Biology and Life Sciences; Computational Biology; Sequence Analysis, DNA; Comparative Genomics; Genome Analysis; Split-Decomposition Method; Genetics, Population; Fish; Sequence Alignment
- Abstract
Bi-allelic Single Nucleotide Polymorphism (SNP) markers are widely used in population genetic studies. In most studies, sequences either side of the SNPs remain unused, although these sequences contain information beyond that used in population genetic studies. In this study, we show how these sequence tags either side of a single nucleotide polymorphism can be used for comparative genome analysis. We used DArTseq (Diversity Array Technology) derived SNP data for a non-model Australian native freshwater fish, Macquaria ambigua, to identify genes linked to SNP associated sequence tags, and to discover homologies with evolutionarily conserved genes and genomic regions. We concatenated 6,776 SNP sequence tags to create a hypothetical genome (representing 0.1-0.3% of the actual genome), which we used to find sequence homologies with 12 model fish species using the Ensembl genome browser with stringent filtering parameters. We identified sequence homologies for 17 evolutionarily conserved genes (cd9b, plk2b, rhot1b, sh3pxd2aa, si:ch211-148f13.1, si:dkey-166d12.2, zgc:66447, atp8a2, clvs2, lyst, mkln1, mnd1, piga, pik3ca, plagl2, rnf6, sec63) along with an ancestral evolutionarily conserved syntenic block (euteleostomi Block_210). Our analysis also revealed repetitive sequences covering approximately 12% of the hypothetical genome where DNA transposon, LTR and non-LTR retrotransposons were most abundant. A hierarchical pattern of the number of sequence homologies with phylogenetically close species validated the approach for repeatability. This new approach of using SNP associated sequence tags for comparative genome analysis may provide insight into the genome evolution of non-model species where whole genome sequences are unavailable.
- Published
- 2019
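
The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how the tags were joined, so the `N` spacer, the tag identifiers, and the function names `concatenate_tags` and `tag_at` are all assumptions made for illustration. The sketch joins tag sequences into one string and records each tag's coordinates, so that a homology hit on the concatenated sequence can be traced back to the tag it came from.

```python
from typing import List, Optional, Tuple

# Hypothetical helper, not from the paper: joins SNP sequence tags into a
# single "hypothetical genome" string and records where each tag lands.
def concatenate_tags(
    tags: List[Tuple[str, str]], spacer: str = "N" * 10
) -> Tuple[str, List[Tuple[str, int, int]]]:
    """Return (sequence, index); index holds (tag_id, start, end), 0-based, end exclusive."""
    pieces: List[str] = []
    index: List[Tuple[str, int, int]] = []
    pos = 0
    for i, (tag_id, seq) in enumerate(tags):
        if i > 0:
            # Assumption: a run of N separates tags so that no alignment
            # spans the artificial junction between two unrelated tags.
            pieces.append(spacer)
            pos += len(spacer)
        seq = seq.upper()
        pieces.append(seq)
        index.append((tag_id, pos, pos + len(seq)))
        pos += len(seq)
    return "".join(pieces), index


def tag_at(index: List[Tuple[str, int, int]], position: int) -> Optional[str]:
    """Map a coordinate on the concatenated sequence back to its tag, or None in a spacer."""
    for tag_id, start, end in index:
        if start <= position < end:
            return tag_id
    return None


# Toy input; the tag names and sequences are invented for the example.
toy_tags = [("tag1", "ACGT"), ("tag2", "ggcc")]
sequence, index = concatenate_tags(toy_tags, spacer="NN")
```

Keeping the coordinate index alongside the sequence is the design choice that matters here: without it, a hit reported at a position on the concatenated sequence could not be assigned to an individual SNP tag.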