Eigenanalysis of SNP data with an identity by descent interpretation

Authors :
Bruce S. Weir
Xiuwen Zheng
Source :
Theoretical Population Biology. 107:65-76
Publication Year :
2016
Publisher :
Elsevier BV, 2016.

Abstract

Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. The explanation of PCA results is of major interest for geneticists to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation between ancestral proportions (AP) of individuals with multiple ancestries and their projections onto the principal components is found. In addition, a new method of eigenanalysis “EIGMIX” is proposed to estimate individual ancestries. EIGMIX is a method of moments with computational efficiency suitable for millions of SNP data, and it is not subject to the assumption of linkage equilibrium. With the assumptions of multiple ancestries and their surrogate ancestral samples, EIGMIX is able to infer ancestral proportions (APs) of individuals. The methods were applied to the SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with a relatively high accuracy.
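
As a rough illustration of the approximately linear relationship between ancestral proportions and principal-component projections described in the abstract, the following Python sketch simulates two ancestral populations plus admixed individuals, runs a generic PCA on standardized genotypes, and maps first-PC coordinates to proportions using the centroids of surrogate ancestral samples. It is not the authors' EIGMIX implementation: the simulated data, the standardization, the function names (`simulate_genotypes`, `first_pc`, `ap_from_pc`), and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_genotypes(rng, n_snps=5000, n_ref=40, n_admixed=20):
    """Simulate 0/1/2 genotypes for two reference groups and admixed samples.

    Hypothetical toy model: each individual's allele frequency at a SNP is a
    mixture of two ancestral frequencies weighted by its ancestral proportion.
    """
    p1 = rng.uniform(0.1, 0.9, n_snps)
    p2 = np.clip(p1 + rng.normal(0.0, 0.15, n_snps), 0.05, 0.95)
    # Proportion of ancestry 1: 1 for group 1, 0 for group 2, graded for admixed.
    true_ap = np.concatenate(
        [np.ones(n_ref), np.zeros(n_ref), np.linspace(0.1, 0.9, n_admixed)]
    )
    freqs = true_ap[:, None] * p1 + (1.0 - true_ap[:, None]) * p2
    genotypes = rng.binomial(2, freqs)
    return genotypes, true_ap


def first_pc(genotypes):
    """Return the leading eigenvector of a sample-by-sample relationship matrix."""
    p_hat = genotypes.mean(axis=0) / 2.0
    keep = (p_hat > 0.0) & (p_hat < 1.0)  # drop monomorphic SNPs
    g = genotypes[:, keep].astype(float)
    p = p_hat[keep]
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    grm = z @ z.T / z.shape[1]
    _, eigvecs = np.linalg.eigh(grm)  # eigenvalues in ascending order
    return eigvecs[:, -1]


def ap_from_pc(pc, ref1_idx, ref2_idx):
    """Linearly rescale PC coordinates so surrogate centroids map to 1 and 0."""
    c1 = pc[ref1_idx].mean()
    c2 = pc[ref2_idx].mean()
    return (pc - c2) / (c1 - c2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes, true_ap = simulate_genotypes(rng)
    pc = first_pc(genotypes)
    est = ap_from_pc(pc, np.arange(0, 40), np.arange(40, 80))
    admixed = slice(80, None)
    print("mean absolute error:", np.abs(est[admixed] - true_ap[admixed]).mean())
```

Because the rescaling divides by the difference between the two surrogate centroids, the arbitrary sign of the eigenvector cancels out. With more than two ancestries, the same idea would need one principal component per additional ancestry and a small linear solve rather than a single ratio.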

Details

ISSN :
0040-5809
Volume :
107
Database :
OpenAIRE
Journal :
Theoretical Population Biology
Accession number :
edsair.doi.dedup.....e28e7fcf5fd3ec51ab8dda003b8ec444
Full Text :
https://doi.org/10.1016/j.tpb.2015.09.004