1. Data-driven assessment, contextualisation and implementation of 134 variables in the risk for type 2 diabetes: an analysis of Lifelines, a prospective cohort study in the Netherlands
- Author
Chirag J. Patel, Thomas P. van der Meer, Bruce H. R. Wolffenbuttel, and Center for Liver, Digestive and Metabolic Diseases (CLDM)
- Subjects
Adult; Male; Risk; Identification; Waist; Endocrinology, Diabetes and Metabolism; Population; Lasso regression; 030209 endocrinology & metabolism; Type 2 diabetes; Prediction models; Risk Assessment; Article; Prediabetic State; 03 medical and health sciences; 0302 clinical medicine; Diabetes mellitus; Machine learning; Internal Medicine; medicine; Humans; Prospective Studies; 030212 general & internal medicine; Family history; education; Prospective cohort study; Netherlands; education.field_of_study; Risk variable-wide association study; business.industry; Incidence; Contextualisation; Middle Aged; Data-driven; medicine.disease; Prospective; Diabetes Mellitus, Type 2; Hyperglycemia; Cohort; Female; business; Predictive modelling; Demography
- Abstract
Aims/hypothesis: We aimed to assess and contextualise 134 potential risk variables for the development of type 2 diabetes and to determine their applicability in risk prediction.
Methods: A total of 96,534 people without baseline diabetes (372,007 person-years) from the Dutch Lifelines cohort were included. We used a risk variable-wide association study (RV-WAS) design to independently screen and replicate risk variables for 5-year incidence of type 2 diabetes. For identified variables, we contextualised HRs, calculated correlations and assessed their robustness and unique contribution in different clinical contexts using bootstrapped and cross-validated lasso regression models. We evaluated the change in risk, or ‘HR trajectory’, when sequentially assigning variables to a model.
Results: We identified 63 risk variables, with novel associations for quality-of-life indicators and non-cardiovascular medications (i.e., proton-pump inhibitors, anti-asthmatics). For continuous variables, an increase of 1 SD of HbA1c, i.e., 3.39 mmol/mol (0.31%), was equivalent in risk to an increase of 0.53 mmol/l of glucose, 19.8 cm of waist circumference, 8.34 kg/m² of BMI, 0.67 mmol/l of HDL-cholesterol, and 0.14 mmol/l of uric acid. Other variables required an increase of >3 SD, which is either physiologically unrealistic or rare in the population. Though the variables were moderately correlated, the inclusion of four variables satiated prediction models. Invasive variables, except for glucose and HbA1c, contributed little compared with non-invasive variables. Glucose, HbA1c and family history of diabetes explained a unique part of disease risk. Adding risk variables to a satiated model can impact the HRs of variables already in the model.
Conclusions: Many variables show weak or inconsistent associations with the development of type 2 diabetes, and only a handful can reliably explain disease risk. Newly discovered risk variables will yield little over established factors, and existing prediction models can be simplified. A systematic, data-driven approach to identify risk variables for the prediction of type 2 diabetes is necessary for the practice of precision medicine.
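The abstract expresses the risk of a 1 SD increase in HbA1c as an equivalent increase in other continuous variables. One way to arrive at such an equivalence, assuming a proportional-hazards model in which each variable has a log-linear effect, is to solve HR_other^k = HR_ref for k, the number of SDs of the other variable. The sketch below illustrates that arithmetic only. The function name, its parameters and the example values are hypothetical, and the record does not confirm that the authors computed their equivalences this way.

```python
import math


def equivalent_increase(hr_ref_per_sd, hr_other_per_sd, sd_other):
    """Return (k, delta): the number of SDs and the raw-unit change of the
    other variable whose hazard ratio equals that of a 1 SD increase in the
    reference variable.

    Assumes log-linear effects, so HR for k SDs is hr_other_per_sd ** k.
    """
    if hr_ref_per_sd <= 0 or hr_other_per_sd <= 0:
        raise ValueError("hazard ratios must be positive")
    if hr_other_per_sd == 1:
        raise ValueError("a variable with HR = 1 cannot match any risk change")
    # Solve hr_other_per_sd ** k == hr_ref_per_sd for k.
    k = math.log(hr_ref_per_sd) / math.log(hr_other_per_sd)
    return k, k * sd_other


# Hypothetical inputs, not taken from the study: the reference variable has
# HR 2.0 per SD, the other variable has HR sqrt(2) per SD and an SD of 5 units.
k, delta = equivalent_increase(2.0, 2 ** 0.5, 5.0)
print(k, delta)
```

A negative k means the other variable is associated with risk in the opposite direction, so the matching change is a decrease rather than an increase. A large absolute k corresponds to the situation the abstract describes for most variables, where more than 3 SD would be needed to match 1 SD of HbA1c.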
- Published
- 2021