BERT based natural language processing for triage of adverse drug reaction reports shows close to human-level performance.

Authors :
Erik Bergman
Luise Dürlich
Veronica Arthurson
Anders Sundström
Maria Larsson
Shamima Bhuiyan
Andreas Jakobsson
Gabriel Westman
Source :
PLOS Digital Health, Vol 2, Iss 12, p e0000409 (2023)
Publication Year :
2023
Publisher :
Public Library of Science (PLoS), 2023.

Abstract

Post-marketing reports of suspected adverse drug reactions are important for establishing the safety profile of a medicinal product. However, a high influx of reports poses a challenge for regulatory authorities, as a delay in identifying previously unknown adverse drug reactions can be harmful to patients. In this study, we use natural language processing (NLP) to predict whether a report is serious in nature based solely on the free-text fields and adverse event terms in the report, potentially allowing reports mislabelled at the time of reporting to be detected and prioritized for assessment. We consider four NLP models of varying complexity, bootstrap their train-validation data split to eliminate random effects in the performance estimates, and conduct prospective testing to avoid the risk of data leakage. Using a Swedish BERT-based language model with continued language pre-training and final classification training, we achieve close to human-level performance in this task. Model architectures built on less complex technical foundations, such as bag-of-words approaches and LSTM neural networks trained from randomly initialized weights, appear to perform less well, likely because they lack the robustness that a base of general language training provides.
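
As an illustration only, the evaluation idea mentioned in the abstract (re-drawing the train-validation split many times so that a single lucky or unlucky split does not dominate the performance estimate) might be sketched as follows. This is not the authors' code. The abstract does not specify the resampling procedure, so the sketch assumes plain repeated random re-splitting; the bag-of-words classifier is a toy naive Bayes stand-in, and the reports in `TOY_REPORTS`, along with all function names, are hypothetical.

```python
import math
import random
import statistics
from collections import Counter

# Hypothetical toy reports: (free text, label), 1 = serious, 0 = non-serious.
TOY_REPORTS = [
    ("patient hospitalized with anaphylaxis", 1),
    ("life threatening liver failure", 1),
    ("hospitalized after severe bleeding", 1),
    ("seizure requiring intensive care", 1),
    ("mild headache resolved", 0),
    ("transient nausea after dose", 0),
    ("mild rash resolved quickly", 0),
    ("slight dizziness no treatment", 0),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def train_bow(reports):
    """Count token frequencies per class: a minimal bag-of-words model."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in reports:
        counts[label].update(tokenize(text))
    return counts


def predict(counts, text):
    """Pick the class with the higher add-one-smoothed log-likelihood."""
    vocab_size = len(set(counts[0]) | set(counts[1]))
    scores = {}
    for label in (0, 1):
        total = sum(counts[label].values())
        scores[label] = sum(
            math.log((counts[label][tok] + 1) / (total + vocab_size))
            for tok in tokenize(text)
        )
    return max(scores, key=scores.get)


def resampled_accuracy(reports, n_rounds=20, val_fraction=0.25, seed=0):
    """Re-draw the train-validation split n_rounds times.

    Returns the mean and standard deviation of validation accuracy.
    Assumption: simple random re-splitting without replacement.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_rounds):
        shuffled = reports[:]
        rng.shuffle(shuffled)
        n_val = max(1, int(len(shuffled) * val_fraction))
        val, train = shuffled[:n_val], shuffled[n_val:]
        model = train_bow(train)
        correct = sum(predict(model, text) == label for text, label in val)
        accuracies.append(correct / len(val))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


if __name__ == "__main__":
    mean_acc, sd_acc = resampled_accuracy(TOY_REPORTS)
    print(f"validation accuracy: mean={mean_acc:.2f}, sd={sd_acc:.2f}")
```

Reporting the spread alongside the mean shows how much of an apparent difference between two models could come from the split alone. The prospective test described in the abstract would additionally be run on reports received after the training data were collected, which this sketch does not cover.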

Details

Language :
English
ISSN :
2767-3170
Volume :
2
Issue :
12
Database :
Directory of Open Access Journals
Journal :
PLOS Digital Health
Publication Type :
Academic Journal
Accession number :
edsdoj.29bf0eefe23b4f5490310a1f50fb79c3
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pdig.0000409&type=printable