1. Accounting for haplotype phase uncertainty in linkage disequilibrium estimation
- Author
- Vessela N. Kristensen, Bettina Kulle, Hege Edvardsen, Arnoldo Frigessi, and Leszek Wojnowski
- Subjects
Linkage disequilibrium, Genotype, Epidemiology, Population, Validation Studies as Topic, Polymorphism, Single Nucleotide, Gene Frequency, Expectation–maximization algorithm, Humans, Computer Simulation, education, Genetics (clinical), Genetic association, Mathematics, Genetics, Models, Genetic, Haplotype, Computational Biology, Contrast (statistics), Weighting, Haplotypes, Haplotype estimation, Algorithm, Software
- Abstract
The characterization of linkage disequilibrium (LD) is applied in a variety of studies, including the identification of molecular determinants of the local recombination rate, the reconstruction of migration and population history, and the role of positive selection in adaptation. LD estimates suffer from the phase uncertainty of the haplotypes used in their calculation, which reflects limitations of the algorithms used for haplotype estimation. We introduce an LD calculation method that deals with phase uncertainty by weighting all possible haplotype pairs according to their estimated probabilities as evaluated by PHASE. In contrast to the expectation-maximization (EM) algorithm as implemented in the HAPLOVIEW and GENETICS packages, our method considers haplotypes based on the entire genetic information available for the candidate region. We tested the method using simulated and real genotyping data. The results show that, for all practical purposes, the new method is advantageous in comparison with algorithms that calculate LD using only the most probable haplotype or bilocus haplotypes based on the EM algorithm. The new method deals especially well with low LD regions, which contribute strongly to phase uncertainty. Altogether, the method is an attractive alternative to standard LD calculation procedures, including those based on the EM algorithm. We implemented the method in the software suite R, together with an interface to the popular haplotype calculation package PHASE. Genet. Epidemiol. 32: 168–178, 2008. © 2007 Wiley-Liss, Inc.
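The weighting idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R implementation, and it does not read PHASE output: the function name `weighted_ld`, the input layout (one list of `(haplotype_1, haplotype_2, probability)` tuples per individual), and the coding of alleles as `'0'`/`'1'` characters are all assumptions made for illustration. Each candidate haplotype pair contributes to the two-locus haplotype counts in proportion to its probability, and the standard measures D, D' and r² are then computed from the weighted frequencies.

```python
from collections import defaultdict


def weighted_ld(samples, i, j):
    """Estimate LD between loci i and j from probability-weighted haplotype pairs.

    samples: list of individuals; each individual is a list of
             (hap1, hap2, prob) tuples whose probabilities sum to 1.
             Haplotypes are strings of '0'/'1' alleles (hypothetical format).
    Returns (D, D_prime, r2).
    """
    counts = defaultdict(float)
    total = 0.0
    for pairs in samples:
        for hap1, hap2, prob in pairs:
            # Both haplotypes of a candidate pair are weighted by the pair's probability.
            for hap in (hap1, hap2):
                counts[(hap[i], hap[j])] += prob
                total += prob

    # Weighted frequencies of the '1' alleles and of the '1'-'1' haplotype.
    p_ab = counts[("1", "1")] / total
    p_a = (counts[("1", "1")] + counts[("1", "0")]) / total
    p_b = (counts[("1", "1")] + counts[("0", "1")]) / total

    d = p_ab - p_a * p_b

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")

    # D' normalizes |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")

    return d, d_prime, r2


# Two unambiguous homozygotes plus one double heterozygote whose phase
# is uncertain (two candidate pairs, each with probability 0.5).
samples = [
    [("11", "11", 1.0)],
    [("00", "00", 1.0)],
    [("11", "00", 0.5), ("10", "01", 0.5)],
]
d, d_prime, r2 = weighted_ld(samples, 0, 1)
```

Keeping only the most probable pair for the third individual would push the estimate toward one of the two phase assignments; the weighted version keeps both in proportion to their probabilities, which is the contrast the abstract draws.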
- Published
- 2008