
A disease inference method based on symptom extraction and bidirectional Long Short Term Memory networks

Authors :
Ying Yu
Fang-Xiang Wu
Donglin Guo
Guihua Duan
Yaohang Li
Min Li
Source :
Methods (San Diego, Calif.). 173
Publication Year :
2019

Abstract

The wide application of automatic disease inference in many medical fields improves the efficiency of medical treatment. Many efforts have been made to predict patients' future health conditions from their full clinical texts, clinical measurements, or medical codes. Symptoms reflect the onset of diseases and can provide credible information for disease diagnosis. In this study, we propose a new disease inference method that extracts symptoms and integrates two symptom representation approaches. To reduce the uncertainty and irregularity of symptom descriptions in Electronic Medical Records (EMR), a comprehensive clinical knowledge database containing a massive amount of data about diseases, symptoms, and their relationships, we extract symptoms with the existing natural language processing tool MetaMap, which is designed for biomedical texts. To take advantage of the complex relationship between symptoms and diseases and thereby enhance the accuracy of disease inference, we present two symptom representation models: a term frequency-inverse document frequency (TF-IDF) model to represent the relationship between symptoms and diseases, and Word2Vec to express the semantic relationship between symptoms. Based on these two symptom representations, we employ bidirectional Long Short Term Memory networks (BiLSTMs) to model symptom sequences in EMR. Our proposed model shows a significant improvement in terms of AUC (0.895) and F1 (0.572) for 50 diseases in the MIMIC-III dataset. The results illustrate that the model combining the two symptom representations performs better than a model using only one of them.
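
As an illustration of the TF-IDF symptom representation mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each disease is treated as a "document" and each extracted symptom as a "term", and it uses the plain tf × log(N/df) weighting; the abstract does not state the exact formulation. The function name `symptom_tfidf` and the toy symptom lists are hypothetical.

```python
import math
from collections import Counter


def symptom_tfidf(disease_symptoms):
    """Map each disease to {symptom: TF-IDF weight}.

    disease_symptoms: dict of disease -> list of symptom mentions
    (hypothetical toy input, not data from the paper).
    """
    n_diseases = len(disease_symptoms)

    # Document frequency: number of diseases in which each symptom occurs.
    df = Counter()
    for symptoms in disease_symptoms.values():
        df.update(set(symptoms))

    weights = {}
    for disease, symptoms in disease_symptoms.items():
        counts = Counter(symptoms)
        total = len(symptoms)
        # Term frequency within the disease times inverse document frequency.
        weights[disease] = {
            s: (c / total) * math.log(n_diseases / df[s])
            for s, c in counts.items()
        }
    return weights


# Toy example (assumed data for illustration only).
toy = {
    "pneumonia": ["cough", "fever", "cough", "dyspnea"],
    "influenza": ["fever", "cough", "myalgia"],
    "migraine": ["headache", "nausea"],
}
weights = symptom_tfidf(toy)
```

In this toy setting, a symptom that appears under only one disease (such as "dyspnea") receives a higher weight than one shared across diseases (such as "fever"), which is the property that makes TF-IDF useful for relating symptoms to diseases.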

Details

ISSN :
1095-9130
Volume :
173
Database :
OpenAIRE
Journal :
Methods (San Diego, Calif.)
Accession number :
edsair.doi.dedup.....4eb23239a9667a5102c10254133156a7