Automated SNOMED CT concept and attribute relationship detection through a web-based implementation of cTAKES

Authors :
Derk L. Arts
Francis Lau
Martijn G. Kersloot
Ronald Cornet
Ameen Abu-Hanna
Graduate School
Medical Informatics
APH - Methodology
APH - Aging & Later Life
General practice
APH - Quality of Care
APH - Digital Health
APH - Global Health
Source :
Journal of Biomedical Semantics, Vol 10, Iss 1, Pp 1-13 (2019); Journal of Biomedical Semantics, 10(1):14. BioMed Central Ltd.
Publication Year :
2019

Abstract

Background Information in Electronic Health Records is largely stored as unstructured free text. Natural language processing (NLP), or Medical Language Processing (MLP) in medicine, aims at extracting structured information from free text, and is less expensive and time-consuming than manual extraction. However, most algorithms in MLP are institution-specific or address only one clinical need, and thus cannot be broadly applied. In addition, most MLP systems do not detect concepts in misspelled text and cannot detect attribute relationships between concepts. The objective of this study was to develop and evaluate an MLP application that includes generic algorithms for the detection of (misspelled) concepts and of attribute relationships between them.
Methods An implementation of the MLP system cTAKES, called DIRECT, was developed with generic SNOMED CT concept filter, concept relationship detection, and attribute relationship detection algorithms and a custom dictionary. Four implementations of cTAKES were evaluated by comparing 98 manually annotated oncology charts with the output of DIRECT. The F1-score was determined for named-entity recognition and attribute relationship detection for the concepts ‘lung cancer’, ‘non-small cell lung cancer’, and ‘recurrence’. The performance of the four implementations was compared with a two-tailed permutation test.
Results DIRECT detected lung cancer and non-small cell lung cancer concepts with F1-scores between 0.828 and 0.947 and between 0.862 and 0.933, respectively. The concept recurrence was detected with an F1-score of 0.921, significantly higher than that of the other implementations, and the relationship between recurrence and lung cancer was detected with an F1-score of 0.857. The precisions of the detection of lung cancer, non-small cell lung cancer, and recurrence concepts were 1.000, 0.966, and 0.879, respectively, compared to precisions of 0.943, 0.967, and 0.000 in the original implementation.
Conclusion DIRECT can detect oncology concepts and attribute relationships with high precision and can detect recurrence with a significant increase in F1-score compared to the original implementation of cTAKES, owing to the use of a custom dictionary and a generic concept relationship detection algorithm. These concepts and relationships can be used to encode clinical narratives, and can thus substantially reduce manual chart abstraction efforts, saving time for clinicians and researchers.
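
The evaluation measures named in the abstract (precision, F1-score, and a two-tailed permutation test between implementations) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes one binary label per chart (concept present or absent) and a paired permutation scheme that randomly swaps the two systems' outputs chart by chart; the abstract does not specify either detail, and all function names are hypothetical.

```python
import random


def confusion_counts(gold, pred):
    """Count true positives, false positives and false negatives.

    gold and pred are equal-length sequences of 0/1 labels, one per chart
    (assumption: chart-level binary annotation).
    """
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and their harmonic mean (F1); 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def f1_score(gold, pred):
    return precision_recall_f1(*confusion_counts(gold, pred))[2]


def paired_permutation_test(gold, pred_a, pred_b, rounds=2000, seed=0):
    """Two-tailed paired permutation test on the F1 difference of two systems.

    In each round, the outputs of system A and B are swapped per chart with
    probability 0.5. The p-value is the share of rounds whose absolute F1
    difference is at least as large as the observed one (with +1 smoothing).
    """
    rng = random.Random(seed)
    observed = abs(f1_score(gold, pred_a) - f1_score(gold, pred_b))
    extreme = 0
    for _ in range(rounds):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(pred_a, pred_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        diff = abs(f1_score(gold, shuffled_a) - f1_score(gold, shuffled_b))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (rounds + 1)
```

Called as `paired_permutation_test(gold, direct_output, baseline_output)`, where the three arguments are hypothetical lists of per-chart labels, the function returns an estimated p-value; a small value suggests the F1 difference between the two implementations is unlikely to be due to chance alone.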

Details

Language :
English
ISSN :
20411480
Volume :
10
Issue :
1
Database :
OpenAIRE
Journal :
Journal of Biomedical Semantics
Accession number :
edsair.doi.dedup.....7c56a87f57cc2eb635718d9f647fdc18