1. Lipoproteins and metabolites in diagnosing and predicting Alzheimer’s disease using machine learning
- Author
- Fenglin Wang, Aimin Wang, Yiming Huang, Wenfeng Gao, Yaqi Xu, Wenjing Zhang, Guiya Guo, Wangchen Song, Yujia Kong, Qinghua Wang, Suzhen Wang, and Fuyan Shi
- Subjects
- Alzheimer’s disease, Random forest, Lasso regression, CatBoost algorithm, Nutritional diseases. Deficiency diseases, RC620-627
- Abstract
Background: Alzheimer’s disease (AD) is a chronic neurodegenerative disorder that poses a substantial economic burden. The random forest algorithm is effective in predicting AD; however, the key factors influencing AD onset remain unclear. This study aimed to identify the key lipoprotein and metabolite factors influencing AD onset using machine-learning methods. The findings offer researchers and medical personnel new insights into AD and provide a reference for its early diagnosis, treatment, and prevention.

Methods: A total of 603 participants (controls and patients with AD) with complete lipoprotein and metabolite data in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database between 2005 and 2016 were enrolled. Random forest, Lasso regression, and CatBoost algorithms were used to rank and filter 213 lipoprotein and metabolite variables. Variables that ranked consistently high in importance under any two of the three methods were retained. Finally, the retained variables, together with the participants’ age, sex, and marital status, were used to construct a random forest predictive model.

Results: Fourteen lipoprotein and metabolite variables were selected using the three methods; with the addition of the participants’ age, sex, and marital status, 17 variables were included in the AD prediction model. The optimal random forest model was constructed with “mtry” set to 3 and “ntree” set to 300. The model achieved an accuracy of 71.01%, a sensitivity of 79.59%, a specificity of 65.28%, and an AUC (95% CI) of 0.724 (0.645–0.804). When the variables were ranked by Mean Decrease Accuracy and Mean Decrease Gini, age, the phospholipids to total lipids ratio in intermediate-density lipoproteins (IDL_PL_PCT), and creatinine were among the top five.

Conclusions: Age, IDL_PL_PCT, and creatinine levels play crucial roles in AD onset. Regular monitoring of lipoproteins and their metabolites in older individuals is important for early AD diagnosis and prevention.
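
The selection rule described in the Methods (keep a variable when at least two of the three ranking methods place it high) can be illustrated with a minimal sketch. The abstract does not state the cutoff that defines a "high" ranking, so `top_k` below is an assumption, and the variable names and rankings are hypothetical placeholders rather than data from the study.

```python
from collections import Counter


def consensus_select(rankings, top_k, min_votes=2):
    """Return variables that appear in the top_k of at least min_votes rankings.

    rankings: dict mapping a method name to a list of variable names,
              ordered from most to least important.
    top_k:    assumed cutoff for "high importance" (not given in the abstract).
    """
    votes = Counter()
    for ranked in rankings.values():
        # Each method contributes at most one vote per variable.
        votes.update(set(ranked[:top_k]))
    kept = [name for name, count in votes.items() if count >= min_votes]
    # Order by number of votes (descending), then alphabetically for stability.
    return sorted(kept, key=lambda name: (-votes[name], name))


# Hypothetical importance rankings from the three methods.
rankings = {
    "random_forest": ["var_a", "var_b", "var_c", "var_d"],
    "lasso": ["var_b", "var_e", "var_a", "var_c"],
    "catboost": ["var_d", "var_a", "var_f", "var_b"],
}

selected = consensus_select(rankings, top_k=3)
print(selected)
```

In the study, the variables retained this way were then combined with age, sex, and marital status before fitting the final random forest model.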
- Published
- 2024