Deep neural networks ensemble for detecting medication mentions in tweets
- Source :
- Journal of the American Medical Informatics Association : JAMIA
- Publication Year :
- 2019
Abstract
- Objective: After years of research, Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiological research is to automatically recognize medication mentions in tweets. Because lexical searches for medication names may fail due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them. Methods: We present Kusuri, an ensemble learning classifier able to identify tweets mentioning drug products and dietary supplements. Kusuri ("medication" in Japanese) is composed of two modules. First, four different classifiers (lexicon-based, spelling-variant-based, pattern-based, and one based on a weakly trained neural network) are applied in parallel to discover tweets potentially containing medication names. Second, an ensemble of deep neural networks encoding morphological, semantic, and long-range dependencies of important words in the discovered tweets makes the final decision. Results: On a balanced (50-50) corpus of 15,005 tweets, Kusuri demonstrated performance close to that of human annotators with a 93.7% F1-score, the best score achieved thus far on this corpus. On a corpus made of all tweets posted by 113 Twitter users (98,959 tweets, with only 0.26% mentioning medications), Kusuri obtained a 76.3% F1-score. No prior drug extraction system is available for comparison on such an extremely unbalanced dataset. Conclusion: The system identifies tweets mentioning drug names with performance high enough to be useful, and it is ready to be integrated into larger natural language processing systems.
This is a pre-copy-editing, author-produced PDF of an article accepted for publication in JAMIA following peer review. The definitive publisher-authenticated version is "D. Weissenbacher, A. Sarker, A. Klein, K. O'Connor, A. Magge, G. Gonzalez-Hernandez, Deep neural networks ensemble for detecting medication mentions in tweets, Journal of the American Medical Informatics Association, ocz156, 2019".
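The two-module design described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy lexicon, the variant table, the regular expression, the stand-in scorers, the "any filter fires" rule for the first module, and the score-averaging rule for the second are all assumptions made for illustration. The weakly trained neural network filter and the deep neural networks are replaced here by simple hypothetical functions.

```python
import re
from typing import Callable, List, Sequence

# Hypothetical toy resources; the record does not describe the real ones.
LEXICON = {"ibuprofen", "prozac", "melatonin"}
VARIANTS = {"ibuprophen": "ibuprofen", "prozak": "prozac"}
DOSE_PATTERN = re.compile(r"\b(?:took|prescribed|on)\s+\d+\s*mg\b", re.IGNORECASE)
INTAKE_WORDS = {"took", "taking", "prescribed"}


def tokens(tweet: str) -> List[str]:
    """Lowercase the tweet and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", tweet.lower())


# Module 1: cheap filters that flag tweets possibly mentioning a medication.
def lexicon_filter(tweet: str) -> bool:
    return any(t in LEXICON for t in tokens(tweet))


def variant_filter(tweet: str) -> bool:
    return any(t in VARIANTS for t in tokens(tweet))


def pattern_filter(tweet: str) -> bool:
    return DOSE_PATTERN.search(tweet) is not None


def is_candidate(tweet: str, filters: Sequence[Callable[[str], bool]]) -> bool:
    """Assumption: a tweet is kept if at least one filter fires."""
    return any(f(tweet) for f in filters)


# Module 2: stand-ins for the ensemble members, each returning a score in [0, 1].
def context_scorer(tweet: str) -> float:
    return 0.9 if INTAKE_WORDS & set(tokens(tweet)) else 0.2


def name_scorer(tweet: str) -> float:
    return 0.6 if any(t in LEXICON or t in VARIANTS for t in tokens(tweet)) else 0.1


def ensemble_decision(tweet: str, scorers: Sequence[Callable[[str], float]],
                      threshold: float = 0.5) -> bool:
    """Assumption: member scores are averaged and compared to a threshold."""
    scores = [s(tweet) for s in scorers]
    return sum(scores) / len(scores) >= threshold


def classify(tweet: str) -> bool:
    """Run the candidate filters first, then the ensemble on surviving tweets."""
    filters = [lexicon_filter, variant_filter, pattern_filter]
    scorers = [context_scorer, name_scorer]
    return is_candidate(tweet, filters) and ensemble_decision(tweet, scorers)
```

In this sketch, a misspelled mention such as "I took some ibuprophen for my headache" passes the first module through the variant table and is accepted by the second, while "Prozac Nation is a great book" is flagged as a candidate by the lexicon but rejected by the averaged scores, which mirrors the ambiguity problem the abstract raises.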
- Subjects :
- FOS: Computer and information sciences
Computer Science - Machine Learning
text classification
020205 medical informatics
Computer science
social media
Health Informatics
02 engineering and technology
Lexicon
Research and Applications
drug name detection
Machine Learning (cs.LG)
Computer Science - Information Retrieval
03 medical and health sciences
Pharmacovigilance
0302 clinical medicine
Deep Learning
Classifier (linguistics)
0202 electrical engineering, electronic engineering, information engineering
Humans
030212 general & internal medicine
Natural Language Processing
Computer Science - Computation and Language
Artificial neural network
Ambiguity
Ensemble learning
Spelling
3. Good health
Pharmaceutical Preparations
Artificial intelligence
Neural Networks, Computer
F1 score
Computation and Language (cs.CL)
Information Retrieval (cs.IR)
Details
- ISSN :
- 1527-974X
- Volume :
- 26
- Issue :
- 12
- Database :
- OpenAIRE
- Journal :
- Journal of the American Medical Informatics Association : JAMIA
- Accession number :
- edsair.doi.dedup.....05903e1d56216b67c325a1dc3af4daf5