1. The DERMACLEAR study: Verification results of a natural language processing system in dermatology
- Author
- Francisco J. Ortiz de Frutos, Ana M. Giménez‐Arnau, Lluís Puig, Juan F. Silvestre, Esther Serra, Laura Salgado‐Boquete, Vicente García‐Patos, Jose L. L. Estebaranz, Jaime Notario, Ana Martin‐Santiago, Gabriel M. Pontevia, Víctor Martín, Guillermo Guinea, Pau Terradas, and Esteban Daudén
- Subjects
deep learning, dermatology, machine learning, natural language processing, OMOP CSM, real world data, Dermatology, RL1-803, Diseases of the genitourinary system. Urology, RC870-923
- Abstract
Background: Accurately determining the epidemiology of dermatological diseases such as hidradenitis suppurativa (HS), psoriasis (PsO), chronic urticaria (CU) and/or atopic dermatitis (AD) is challenging due to variations in prevalence and disease severity in the reported literature.
Objectives: The DERMACLEAR study aims to use natural language processing (NLP) to assess the proportions of patients with HS, PsO, CU and/or AD, and obtain information on patient profiles, patient journeys, and disease and healthcare burden in Spain. Here, the study design and objectives of the DERMACLEAR study are described and the precision of the NLP system used is assessed.
Methods: This study will retrospectively collect patient information from electronic health records (EHRs) at dermatology departments from seven tertiary hospitals in Spain. The NLP system was developed by IOMED Medical Solutions and was verified internally (IOMED scientific team) and externally (principal investigators of each hospital) to determine its precision in identifying patients with HS, PsO, CU and/or AD. Furthermore, internal verification was performed on other medical variables relevant to the study.
Results: To date, the DERMACLEAR study has retrospectively collected data from 54,458 patients with HS, PsO, CU and/or AD (HS: 5045; PsO: 32,559; CU: 8397; AD: 12,492). The average precision of the NLP system to identify patients diagnosed with HS, PsO, CU, and/or AD across all hospitals exceeded 95% via external and internal verification.
Conclusions: Results from the DERMACLEAR study will increase the real‐world evidence of clinical practice, obtaining a large amount of information on patients with the studied diseases. The NLP system used is precise in identifying patients diagnosed with HS, PsO, CU and/or AD, and other medical variables from EHRs, highlighting that it is a valid system to use in the DERMACLEAR study.
- Published
- 2023