ATLAS: an automated association test using probabilistically linked health records with application to genetic studies
- Source :
- Journal of the American Medical Informatics Association (J Am Med Inform Assoc), BMJ Publishing Group, 2021, 28 (12), pp.2582-2592. ⟨10.1093/jamia/ocab187⟩
- Publication Year :
- 2021
- Publisher :
- HAL CCSD, 2021.
Abstract
- Objective: Large amounts of health data are becoming available for biomedical research. Synthesizing information across databases may capture more comprehensive pictures of patient health and enable novel research studies. When no gold standard mappings between patient records are available, researchers may probabilistically link records from separate databases and analyze the linked data. However, previous linked data inference methods are constrained to certain linkage settings and exhibit low power. Here, we present ATLAS, an automated, flexible, and robust association testing algorithm for probabilistically linked data.
- Materials and Methods: Missing variables are imputed at various thresholds using a weighted average method that propagates uncertainty from probabilistic linkage. Next, estimated effect sizes are obtained using a generalized linear model. ATLAS then conducts the threshold combination test by optimally combining P values obtained from data imputed at varying thresholds using Fisher’s method and perturbation resampling.
- Results: In simulations, ATLAS controls for type I error and exhibits high power compared to previous methods. In a real-world genetic association study, meta-analysis of ATLAS-enabled analyses on a linked cohort with analyses using an existing cohort yielded additional significant associations between rheumatoid arthritis genetic risk score and laboratory biomarkers.
- Discussion: Weighted average imputation weathers false matches and increases contribution of true matches to mitigate linkage error-induced bias. The threshold combination test avoids arbitrarily choosing a threshold to rule a match, thus automating linked data-enabled analyses and preserving power.
- Conclusion: ATLAS promises to enable novel and powerful research studies using linked data to capitalize on all available data sources.
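The two building blocks named in the abstract, threshold-based weighted average imputation and Fisher's method for combining P values, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the rule of dropping candidates below the threshold, and the weighting by match probability are assumptions made for illustration only. The closed-form chi-square P value shown here is valid only for independent P values; the abstract states that ATLAS instead uses perturbation resampling, which this sketch does not attempt.

```python
import math


def weighted_average_impute(match_probs, candidate_values, threshold):
    """Impute one missing value from candidate linked records (hypothetical helper).

    Assumption: candidates with match probability below `threshold` are
    dropped, and the rest are averaged with weights proportional to their
    match probabilities. Returns None if no candidate clears the threshold.
    """
    kept = [(p, v) for p, v in zip(match_probs, candidate_values) if p >= threshold]
    total = sum(p for p, _ in kept)
    if total == 0:
        return None
    return sum(p * v for p, v in kept) / total


def fisher_statistic(p_values):
    """Fisher's combined statistic: -2 * sum(log p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def fisher_pvalue_independent(p_values):
    """Combined P value assuming INDEPENDENT inputs.

    Under independence the statistic is chi-square with 2k degrees of
    freedom, whose survival function has a closed form for even df.
    P values computed from the same data at different thresholds are
    dependent, so this reference distribution would not apply to them.
    """
    k = len(p_values)
    half = fisher_statistic(p_values) / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Toy usage: one record with two candidate matches, imputed at two thresholds.
probs, values = [0.9, 0.1], [10.0, 20.0]
low = weighted_average_impute(probs, values, threshold=0.0)   # uses both candidates
high = weighted_average_impute(probs, values, threshold=0.5)  # uses only the 0.9 match
```

Lowering the threshold lets weak matches contribute in proportion to their probability, while raising it restricts the average to confident matches; combining results across thresholds avoids committing to a single cutoff.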
- Subjects :
- Databases, Factual
Computer science
genetic association studies
Inference
perturbation resampling
Health Informatics
Research and Applications
03 medical and health sciences
0302 clinical medicine
Bias
[MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]
Resampling
Humans
030212 general & internal medicine
Imputation (statistics)
030304 developmental biology
Linkage (software)
0303 health sciences
[SDV.MHEP] Life Sciences [q-bio]/Human health and pathology
Diagnostic Tests, Routine
Probabilistic logic
Linked data
electronic health records
biorepositories
record linkage
Data mining
Medical Record Linkage
Algorithms
Type I and type II errors
- Language :
- English
- ISSN :
- 1067-5027 and 1527-974X
- Database :
- OpenAIRE
- Journal :
- Journal of the American Medical Informatics Association (J Am Med Inform Assoc), BMJ Publishing Group, 2021, 28 (12), pp.2582-2592. ⟨10.1093/jamia/ocab187⟩
- Accession number :
- edsair.doi.dedup.....204540290983aed2d0525c777d67740e
- Full Text :
- https://doi.org/10.1093/jamia/ocab187