A Protocol to Extract a Specific Genomic Region from a Public Whole-Genome Database and Modify Analytical Bin Length for Population Genetic Studies
- Source :
- Methods and Protocols, Vol 7, Iss 4, p 57 (2024)
- Publication Year :
- 2024
- Publisher :
- MDPI AG, 2024.
Abstract
- With the advent of “next-generation” sequencing and the continuous reduction in sequencing costs, an increasing amount of genomic data has emerged, such as whole-genome, whole-exome, and targeted sequencing data. These applications are popular not only in mega sequencing projects, such as the 1000 Genomes Project and UK Biobank, but also among individual researchers. Evolutionary genetic analyses, such as the dN/dS ratio and Tajima’s D, are increasingly in demand for whole-genome-level population data. These analyses are often carried out with a uniform custom bin size across the genome. However, they require subdivision of a genomic region into functional units, such as protein-coding regions, introns, and untranslated regions, and computing these genetic measures for large-scale data remains challenging. In a recent investigation, we devised a method to address this issue. The method requires a multi-sample VCF file containing population data, a reference genome, target regions in a BED file, and a list of samples to be included in the analysis. Because the targeted regions are extracted into a new VCF file, targeted population genetic analysis can then be performed. We conducted Tajima’s D analysis using this approach on intact genes and pseudogenes, as well as non-coding regions.
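To illustrate the kind of computation the abstract describes, the following is a minimal Python sketch. It is not the authors' implementation. It assumes biallelic sites and that alternate-allele counts have already been taken from the VCF for the selected samples. The function names (`parse_bed`, `in_regions`, `tajimas_d`) are hypothetical. The sketch restricts variants to BED intervals, converting between the 0-based, half-open BED convention and 1-based VCF positions, and then computes Tajima's D from the standard formula.

```python
from math import sqrt


def parse_bed(lines):
    """Read BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def in_regions(chrom, pos, regions):
    """True if a 1-based VCF position lies inside any BED interval."""
    # BED [start, end) in 0-based coordinates covers 1-based start+1 .. end.
    return any(start < pos <= end for start, end in regions.get(chrom, []))


def tajimas_d(alt_counts, n):
    """Tajima's D from per-site alternate-allele counts among n chromosomes.

    Assumes biallelic sites with no missing genotypes (an assumption of
    this sketch). Returns None when D is undefined.
    """
    # Keep only segregating sites (both alleles present in the sample).
    counts = [k for k in alt_counts if 0 < k < n]
    s = len(counts)
    if n < 4 or s == 0:
        return None  # the variance term is zero for n < 4

    # Mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / sqrt(variance)
```

In this sketch, an excess of rare variants (for example, all singletons) gives a negative D, while an excess of intermediate-frequency variants gives a positive D. Applying `in_regions` before counting alleles is what makes the statistic specific to a functional unit instead of a fixed-width bin.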
Details
- Language :
- English
- ISSN :
- 24099279
- Volume :
- 7
- Issue :
- 4
- Database :
- Directory of Open Access Journals
- Journal :
- Methods and Protocols
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.8a1b65b8efd4c7b9eb207a690d416e3
- Document Type :
- article
- Full Text :
- https://doi.org/10.3390/mps7040057