The effect of sample size on polygenic hazard models for prostate cancer.

Authors :
Karunamuni RA
Huynh-Le MP
Fan CC
Eeles RA
Easton DF
Kote-Jarai Z
Amin Al Olama A
Benlloch Garcia S
Muir K
Gronberg H
Wiklund F
Aly M
Schleutker J
Sipeky C
Tammela TLJ
Nordestgaard BG
Key TJ
Travis RC
Neal DE
Donovan JL
Hamdy FC
Pharoah P
Pashayan N
Khaw KT
Thibodeau SN
McDonnell SK
Schaid DJ
Maier C
Vogel W
Luedeke M
Herkommer K
Kibel AS
Cybulski C
Wokolorczyk D
Kluzniak W
Cannon-Albright L
Brenner H
Schöttker B
Holleczek B
Park JY
Sellers TA
Lin HY
Slavov C
Kaneva R
Mitev V
Batra J
Clements JA
Spurdle A
Teixeira MR
Paulo P
Maia S
Pandha H
Michael A
Mills IG
Andreassen OA
Dale AM
Seibert TM
Source :
European journal of human genetics : EJHG [Eur J Hum Genet] 2020 Oct; Vol. 28 (10), pp. 1467-1475. Date of Electronic Publication: 2020 Jun 08.
Publication Year :
2020

Abstract

We determined the effect of sample size on performance of polygenic hazard score (PHS) models in prostate cancer. Age and genotypes were obtained for 40,861 men from the PRACTICAL consortium. The dataset included 201,590 SNPs per subject, and was split into training and testing sets. Established-SNP models considered 65 SNPs that had been previously associated with prostate cancer. Discovery-SNP models used stepwise selection to identify new SNPs. The performance of each PHS model was calculated for random sizes of the training set. The performance of a representative Established-SNP model was estimated for random sizes of the testing set. Mean HR98/50 (hazard ratio of top 2% to average in test set) of the Established-SNP model increased from 1.73 [95% CI: 1.69-1.77] to 2.41 [2.40-2.43] when the number of training samples was increased from 1 thousand to 30 thousand. Corresponding HR98/50 of the Discovery-SNP model increased from 1.05 [0.93-1.18] to 2.19 [2.16-2.23]. HR98/50 of a representative Established-SNP model using testing set sample sizes of 0.6 thousand and 6 thousand observations were 1.78 [1.70-1.85] and 1.73 [1.71-1.76], respectively. We estimate that a study population of 20 thousand men is required to develop Discovery-SNP PHS models while 10 thousand men should be sufficient for Established-SNP models.
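
As a rough illustration of the HR98/50 metric described in the abstract, the sketch below computes a hazard ratio between the top 2% of scores and an "average" group under a Cox proportional hazards model with the score as its only covariate. Several things here are assumptions that the abstract does not state: that "average" means the 30th to 70th percentile band, that the ratio is taken as exp(beta × difference in mean score), and the function names, the `beta` parameter, and the synthetic scores themselves. It is not the authors' code.

```python
import math
import random


def percentile(sorted_vals, pct):
    """Linearly interpolated percentile of an already sorted list."""
    k = (len(sorted_vals) - 1) * pct / 100.0
    lo = math.floor(k)
    hi = math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def hr_98_50(scores, beta=1.0, top_pct=98.0, mid_range=(30.0, 70.0)):
    """Hazard ratio of the top group versus the middle band of scores.

    Assumes a Cox model with hazard proportional to exp(beta * score), so the
    ratio between two groups is exp(beta * (mean_top - mean_mid)).
    The 30-70 percentile band for "average" is an assumption of this sketch.
    """
    s = sorted(scores)
    top_cut = percentile(s, top_pct)
    mid_lo = percentile(s, mid_range[0])
    mid_hi = percentile(s, mid_range[1])
    top = [x for x in s if x >= top_cut]
    mid = [x for x in s if mid_lo <= x <= mid_hi]
    top_mean = sum(top) / len(top)
    mid_mean = sum(mid) / len(mid)
    return math.exp(beta * (top_mean - mid_mean))


# Synthetic scores for illustration only; these are not PRACTICAL data.
random.seed(0)
synthetic_scores = [random.gauss(0.0, 0.4) for _ in range(10_000)]
print(hr_98_50(synthetic_scores))
```

In a study like the one summarized above, `beta` would come from a Cox regression fitted on the test set, and the calculation would be repeated over random subsamples of the training or testing data to obtain the reported means and confidence intervals.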

Details

Language :
English
ISSN :
1476-5438
Volume :
28
Issue :
10
Database :
MEDLINE
Journal :
European journal of human genetics : EJHG
Publication Type :
Academic Journal
Accession number :
32514134
Full Text :
https://doi.org/10.1038/s41431-020-0664-2