A likelihood-based framework for variant calling and de novo mutation detection in families
- Source :
- PLoS Genetics, Vol 8, Iss 10, p e1002944 (2012)
- Publication Year :
- 2012
- Publisher :
- Public Library of Science (PLoS), 2012.
Abstract
- Family samples, which can be enriched for rare causal variants by focusing on families with multiple extreme individuals and which facilitate detection of de novo mutation events, provide an attractive resource for next-generation sequencing studies. Here, we describe, implement, and evaluate a likelihood-based framework for analysis of next-generation sequence data in family samples. Our framework is able to identify variant sites accurately and to assign individual genotypes, and can handle de novo mutation events, increasing the sensitivity and specificity of variant calling and de novo mutation detection. Through simulations we show that explicit modeling of family relationships is especially useful for analyses of low-frequency variants and that genotype accuracy increases with the number of individuals sequenced per family. Compared with the standard approach of ignoring relatedness, our methods identify and accurately genotype more variants, and have high specificity for detecting de novo mutation events. The improvement in accuracy using our methods over the standard approach is particularly pronounced for low-frequency variants. Furthermore, the family-aware calling framework dramatically reduces Mendelian inconsistencies and is beneficial for family-based analysis. We hope our framework and software will facilitate continuing efforts to identify genetic factors underlying human diseases.
Author Summary
- New sequencing methods can be used to study how genetic variation contributes to disease. For studies of rare variation, family designs are especially attractive because they allow even very rare variants to be observed in multiple individuals and because they can be used to study the impact of de novo mutation events. An important challenge is that most raw sequencing data include many errors. Here, we develop a new approach for interpreting sequence data. We show that by analyzing sequence data across many family members together it is possible to greatly reduce error rates (measured either as the number of true variants that are missed or the number of false variants that are claimed). In addition to facilitating detection and genotyping of SNPs, our methods can interface with existing tools to improve the accuracy of calls for more challenging short insertion-deletion polymorphisms and other types of variants. Our methods should make studies of families even more attractive because, in addition to making it easy to study rare variants and de novo mutation events, family studies will now be able to better transform sequence data into accurate genotypes.
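The idea of calling genotypes jointly across relatives, rather than one sample at a time, can be illustrated with a minimal sketch for a single biallelic site in a father-mother-child trio. This is not the authors' software or their exact model. The function names, the Hardy-Weinberg founder prior, the symmetric per-allele mutation rate `mu`, and the example genotype likelihoods are all assumptions made for illustration.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried


def hwe_prior(g, q):
    """Hardy-Weinberg prior for a founder genotype (assumed model); q = alt allele frequency."""
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)[g]


def transmit_alt(g, mu):
    """Probability that a parent with genotype g passes on the alternate allele.

    Assumes a symmetric per-allele mutation rate mu (hypothetical simplification).
    """
    p = g / 2.0
    return p * (1 - mu) + (1 - p) * mu


def child_given_parents(c, f, m, mu):
    """P(child genotype c | father f, mother m) under Mendelian transmission plus mutation."""
    pf, pm = transmit_alt(f, mu), transmit_alt(m, mu)
    return ((1 - pf) * (1 - pm), pf * (1 - pm) + (1 - pf) * pm, pf * pm)[c]


def trio_posterior(gl_father, gl_mother, gl_child, q, mu=1e-8):
    """Joint posterior over (father, mother, child) genotypes.

    Each gl_* is a 3-tuple of genotype likelihoods P(reads | genotype).
    """
    joint = {}
    for f, m, c in product(GENOTYPES, repeat=3):
        joint[(f, m, c)] = (
            hwe_prior(f, q) * hwe_prior(m, q)
            * child_given_parents(c, f, m, mu)
            * gl_father[f] * gl_mother[m] * gl_child[c]
        )
    total = sum(joint.values())
    return {trio: p / total for trio, p in joint.items()}


def marginal(posterior, index):
    """Marginal genotype posterior for one family member (0=father, 1=mother, 2=child)."""
    out = [0.0, 0.0, 0.0]
    for trio, p in posterior.items():
        out[trio[index]] += p
    return out


# Made-up likelihoods: the child's reads cannot separate genotype 0 from genotype 1.
ambiguous_child = (0.5, 0.5, 1e-6)
clear_hom_ref = (1.0, 1e-3, 1e-6)
clear_het = (1e-3, 1.0, 1e-3)

both_parents_ref = trio_posterior(clear_hom_ref, clear_hom_ref, ambiguous_child, q=0.01)
father_het = trio_posterior(clear_het, clear_hom_ref, ambiguous_child, q=0.01)

print("child marginal, parents hom-ref:", marginal(both_parents_ref, 2))
print("child marginal, father het:     ", marginal(father_het, 2))
```

In this toy setup the child's own reads are uninformative, so the call is driven by the parents: a heterozygous child is far more plausible when one parent clearly carries the variant than when both parents appear homozygous reference. A caller that ignores relatedness would treat the two cases identically.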
- Subjects :
- Cancer Research
Heredity
Genotype
lcsh:QH426-470
DNA Mutational Analysis
Genotypes
Single-nucleotide polymorphism
Genomics
Biology
Polymorphism, Single Nucleotide
Gene Frequency
Genetic Mutation
Genetics
Animals
Humans
Computer Simulation
Family
Genome Sequencing
Molecular Biology
Genotyping
Allele frequency
Genetics (clinical)
Ecology, Evolution, Behavior and Systematics
Likelihood Functions
Haplotype
Computational Biology
Human Genetics
Pedigree
lcsh:Genetics
ROC Curve
Genetics of Disease
Mutation (genetic algorithm)
Mendelian inheritance
Algorithms
Research Article
Details
- Language :
- English
- ISSN :
- 1553-7404 and 1553-7390
- Volume :
- 8
- Issue :
- 10
- Database :
- OpenAIRE
- Journal :
- PLoS Genetics
- Accession number :
- edsair.doi.dedup.....04d092f3352c4a661339b261e0f146f6