GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data
- Source :
- PLoS Genetics, Vol 12, Iss 2, p e1005631 (2016), CONICET Digital (CONICET), Consejo Nacional de Investigaciones Científicas y Técnicas, instacron:CONICET, PLoS genetics, vol 12, iss 2, SEDICI (UNLP), Universidad Nacional de La Plata, instacron:UNLP, PLoS Genetics
- Publication Year :
- 2016
- Publisher :
- Public Library of Science, 2016.
Abstract
- Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
- Facultad de Ciencias Naturales y Museo
- Instituto Multidisciplinario de Biología Celular
- Subjects :
- 0106 biological sciences
0301 basic medicine
Cancer Research
Mutation rate
Heredity
Genotyping Techniques
Statistics as Topic
Test Statistics
Population genetics
01 natural sciences
purl.org/becyt/ford/1 [https]
Mathematical and Statistical Techniques
Genotype
REDUCED REPRESENTATION LIBRARIES
Genome Sequencing
Genetics (clinical)
Genetics
education.field_of_study
High-Throughput Nucleotide Sequencing
Genomics
Single Nucleotide
Genetic Mapping
NGS
Physical Sciences
Statistics (Mathematics)
CIENCIAS NATURALES Y EXACTAS
Research Article
Statistical Distributions
lcsh:QH426-470
Otras Ciencias Biológicas
Population
Variant Genotypes
Computational biology
GBS
Biology
Research and Analysis Methods
Polymorphism, Single Nucleotide
010603 evolutionary biology
GENOTYPE BY SEQUENCING
Ciencias Biológicas
03 medical and health sciences
Genetic variation
Humans
Statistical Methods
Polymorphism
Molecular Biology Techniques
Sequencing Techniques
purl.org/becyt/ford/1.6 [https]
education
Molecular Biology
Ciencias Exactas
Alleles
Ecology, Evolution, Behavior and Systematics
Statistical hypothesis testing
Evolutionary Biology
Population Biology
Human Genome
Haplotype
Biology and Life Sciences
Computational Biology
Genome Analysis
Genomic Libraries
Probability Theory
lcsh:Genetics
Genetics, Population
030104 developmental biology
Haplotypes
Genetic Loci
genetic variation
Generic health relevance
Population Genetics
Mathematics
Software
Developmental Biology
Reference genome
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- PLoS Genetics, Vol 12, Iss 2, p e1005631 (2016), CONICET Digital (CONICET), Consejo Nacional de Investigaciones Científicas y Técnicas, instacron:CONICET, PLoS genetics, vol 12, iss 2, SEDICI (UNLP), Universidad Nacional de La Plata, instacron:UNLP, PLoS Genetics
- Accession number :
- edsair.doi.dedup.....2205c71ac5632831495be227969a1142
- Full Text :
- https://doi.org/10.1371/journal.pgen.1005631