Development of a novel machine learning model to predict presence of nonalcoholic steatohepatitis.

Authors :
Docherty, Matt
Regnier, Stephane A
Capkun, Gorana
Balp, Maria-Magdalena
Ye, Qin
Janssens, Nico
Tietz, Andreas
Löffler, Jürgen
Cai, Jennifer
Pedrosa, Marcos C
Schattenberg, Jörn M
Source :
Journal of the American Medical Informatics Association; Jun 2021, Vol. 28 Issue 6, p1235-1241, 7p, 4 Charts, 2 Graphs
Publication Year :
2021

Abstract

Objective: To develop a computer model to predict patients with nonalcoholic steatohepatitis (NASH) using machine learning (ML).

Materials and Methods: This retrospective study utilized two databases: a) the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) nonalcoholic fatty liver disease (NAFLD) adult database (2004-2009), and b) the Optum® de-identified Electronic Health Record dataset (2007-2018), a real-world dataset representative of common electronic health records in the United States. We developed an ML model to predict NASH, using confirmed NASH and non-NASH based on liver histology results in the NIDDK dataset to train the model.

Results: Models were trained and tested on NIDDK NAFLD data (704 patients) and the best-performing models evaluated on Optum data (~3,000,000 patients). An eXtreme Gradient Boosting model (XGBoost) consisting of 14 features exhibited high performance as measured by area under the curve (0.82), sensitivity (81%), and precision (81%) in predicting NASH. Slightly reduced performance was observed with an abbreviated feature set of 5 variables (0.79, 80%, 80%, respectively). The full model demonstrated good performance (AUC 0.76) to predict NASH in Optum data.

Discussion: The proposed model, named NASHmap, is the first ML model developed with confirmed NASH and non-NASH cases as determined through liver biopsy and validated on a large, real-world patient dataset. Both the 14 and 5-feature versions exhibit high performance.

Conclusion: The NASHmap model is a convenient and high performing tool that could be used to identify patients likely to have NASH in clinical settings, allowing better patient management and optimal allocation of clinical resources. [ABSTRACT FROM AUTHOR]
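
Note on the reported metrics: the abstract summarizes performance by area under the curve (AUC), sensitivity, and precision. The sketch below shows one common way these three quantities can be computed for a binary classifier. It is not the NASHmap implementation; the function names, the 0.5 decision threshold, and the six toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 = positive (e.g. NASH)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true positives that the classifier flags: tp / (tp + fn)."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of flagged cases that are true positives: tp / (tp + fp)."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = biopsy-confirmed positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]          # assumed model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"AUC={roc_auc(y_true, scores):.2f}",
      f"sensitivity={sensitivity(y_true, y_pred):.2f}",
      f"precision={precision(y_true, y_pred):.2f}")
```

AUC is computed from the raw scores and so does not depend on a threshold, whereas sensitivity and precision both change with the threshold chosen to turn scores into predictions.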

Details

Language :
English
ISSN :
10675027
Volume :
28
Issue :
6
Database :
Complementary Index
Journal :
Journal of the American Medical Informatics Association
Publication Type :
Academic Journal
Accession number :
150938285
Full Text :
https://doi.org/10.1093/jamia/ocab003