1. Inverse probability weighting is an effective method to address selection bias during the analysis of high dimensional data
- Author
Patrick M. Carry, Jill M. Norris, Lauren A. Vanderlinden, Katerina Kechris, Teresa Buckner, Fran Dong, Elizabeth Litkowski, and Timothy Vigers
- Subjects
Clustering high-dimensional data, Sample selection, Adolescent, Epidemiology, media_common.quotation_subject, Population, Sample (statistics), Age and sex, Article, Cohort Studies, 03 medical and health sciences, Bias, Statistics, Humans, education, Selection Bias, Genetics (clinical), Probability, 030304 developmental biology, Mathematics, media_common, Selection bias, 0303 health sciences, education.field_of_study, Inverse probability weighting, 030305 genetics & heredity, Genome-Wide Association Study, Type I and type II errors
- Abstract
Omics studies frequently use samples collected during cohort studies. Conditioning on sample availability can cause selection bias if sample availability is nonrandom. Inverse probability weighting (IPW) is purported to reduce this bias. We evaluated IPW in an epigenome-wide analysis testing the association between DNA methylation (261,435 probes) and age in healthy adolescent subjects (n = 114). We simulated age and sex to be correlated with sample selection and then evaluated four conditions: complete population/no selection bias (all subjects), naïve selection bias (no adjustment), and IPW selection bias (selection bias with IPW adjustment). Assuming the complete population condition represented the "truth," we compared each condition to the complete population condition. Bias or difference in associations between age and methylation was reduced in the IPW condition versus the naïve condition. However, genomic inflation and type 1 error were higher in the IPW condition relative to the naïve condition. Postadjustment using bacon, type 1 error and inflation were similar across all conditions. Power was higher under the IPW condition compared with the naïve condition before and after inflation adjustment. IPW methods can reduce bias in genome-wide analyses. Genomic inflation is a potential concern that can be minimized using methods that adjust for inflation.
- Published
- 2021
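The weighting idea summarized in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or simulation design. It assumes a hypothetical cohort far larger than the study's n = 114, a single methylation-like outcome instead of 261,435 probes, an age effect that differs by sex (so that selecting on sex and age distorts the unadjusted slope), and a logistic selection model. All variable names, effect sizes, and the helper functions `fit_logistic` and `weighted_slope` are illustrative assumptions.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

def weighted_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)[1]

rng = np.random.default_rng(0)
n = 20_000  # hypothetical cohort size (assumption; the study had n = 114)

age = rng.uniform(10, 20, n)
age_c = age - age.mean()
sex = rng.integers(0, 2, n).astype(float)

# Assumed outcome model: the age slope differs by sex, so the marginal
# age slope depends on the sex and age mix of whoever is analyzed.
meth = 0.5 + (0.02 + 0.02 * sex) * age_c + rng.normal(0, 0.05, n)

# Assumed selection mechanism: sample availability depends on age and sex.
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * sex + 0.3 * age_c)))
selected = rng.random(n) < p_true

# Estimate each subject's selection probability from covariates that are
# known for the whole cohort, then weight selected subjects by 1 / p_hat.
X_sel = np.column_stack([np.ones(n), sex, age_c])
beta_hat = fit_logistic(X_sel, selected.astype(float))
p_hat = 1.0 / (1.0 + np.exp(-X_sel @ beta_hat))
ipw = 1.0 / p_hat[selected]

full_slope = weighted_slope(age_c, meth, np.ones(n))
naive_slope = weighted_slope(age_c[selected], meth[selected], np.ones(selected.sum()))
ipw_slope = weighted_slope(age_c[selected], meth[selected], ipw)

print(f"complete population: {full_slope:.4f}")
print(f"naive (selected only): {naive_slope:.4f}")
print(f"IPW (selected, weighted): {ipw_slope:.4f}")
```

Under these assumptions the weighted estimate should sit closer to the complete-population slope than the unweighted one. The sketch covers only the bias comparison; it does not reproduce the genomic inflation, type 1 error, or power results, nor the bacon adjustment mentioned in the abstract.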