Rui Meng, Hui Wang, Zhikang Si, Xuelin Wang, Zekun Zhao, Haipeng Lu, Yizhan Zheng, Jiaqi Chen, Huan Wang, Jiaqi Hu, Ling Xue, Xiaoming Li, Jian Sun, and Jianhui Wu
Abstract

Background: The incidence of nonalcoholic fatty liver disease (NAFLD) is rising rapidly worldwide, making it a major public health challenge. The steel industry is a cornerstone of China's economy and employs a large workforce, whose health is coming under increasing scrutiny. A risk assessment model for NAFLD in steelworkers would support risk stratification in this population and facilitate early intervention.

Methods: This was a cross-sectional study of 3328 steelworkers who underwent occupational health evaluations between January and September 2017. Hepatic steatosis was diagnosed in all participants by abdominal ultrasound. Influencing factors were identified using chi-square (χ²) tests and unconditional logistic regression analysis, and the variables entered into the models were selected with reference to the pertinent literature. Logistic regression, random forest, and XGBoost models were constructed and compared in terms of accuracy, area under the curve (AUC), and F1 score. A NAFLD risk scoring system was then built on the optimal model.

Results: Sex, overweight, obesity, hyperuricemia, dyslipidemia, occupational dust exposure, and alanine aminotransferase (ALT) were associated with NAFLD risk in steelworkers, with odds ratios (OR, 95% confidence interval (CI)) of 0.672 (0.487–0.928), 4.971 (3.981–6.207), 16.887 (12.99–21.953), 2.124 (1.77–2.548), 2.315 (1.63–3.288), 1.254 (1.014–1.551), and 3.629 (2.705–4.869), respectively. The sensitivity of the three models (logistic regression, random forest, and XGBoost) was 0.607, 0.680, and 0.564, respectively, and the precision was 0.708, 0.643, and 0.701, respectively.
The AUC values were 0.839, 0.839, and 0.832, and the Brier scores were 0.150, 0.153, and 0.155, respectively. The F1 scores were 0.654, 0.661, and 0.625, and the log loss values were 0.460, 0.661, and 0.564, respectively. The R² values were 0.789, 0.771, and 0.778, respectively. Performance was comparable across the three models, with no significant differences observed. The NAFLD risk scoring system showed strong risk detection capability, with a cutoff value of 86.

Conclusions: Sex, body mass index (BMI), dyslipidemia, hyperuricemia, occupational dust exposure, and ALT were significant risk factors for NAFLD among steelworkers. The traditional logistic regression model was as effective as the random forest and XGBoost models in assessing NAFLD risk. The optimal cutoff value of the risk score was 86. The scoring system gives clinicians a visually accessible way to stratify NAFLD risk in steelworkers, supporting early identification and intervention among those at risk.
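The reported F1 scores can be cross-checked against the sensitivity and precision values given in the Results, because F1 is the harmonic mean of precision and recall (recall being the same quantity as sensitivity). The Python sketch below performs that check. The function names and the tolerance are illustrative, and the assumption that each triple of values is listed in the order logistic regression, random forest, XGBoost is inferred from the order in which the models are introduced, not stated explicitly in the abstract.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (recall = sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Results. The mapping of each value to a model
# is an ASSUMPTION based on the order in which the models are listed.
REPORTED = {
    "logistic regression": {"precision": 0.708, "sensitivity": 0.607, "f1": 0.654},
    "random forest": {"precision": 0.643, "sensitivity": 0.680, "f1": 0.661},
    "XGBoost": {"precision": 0.701, "sensitivity": 0.564, "f1": 0.625},
}


def check_reported_f1(tolerance: float = 1e-3) -> dict:
    """Recompute F1 per model and flag whether it matches the reported value.

    The tolerance allows for the inputs having been rounded to three decimals.
    """
    results = {}
    for model, m in REPORTED.items():
        computed = f1_from_precision_recall(m["precision"], m["sensitivity"])
        results[model] = {
            "computed_f1": computed,
            "reported_f1": m["f1"],
            "consistent": abs(computed - m["f1"]) <= tolerance,
        }
    return results


if __name__ == "__main__":
    for model, r in check_reported_f1().items():
        print(f"{model}: computed {r['computed_f1']:.3f}, "
              f"reported {r['reported_f1']:.3f}, consistent={r['consistent']}")
```

Under the assumed model order, the recomputed values agree with the reported F1 scores to three decimal places, which supports reading the sensitivity, precision, and F1 triples in the same order.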