
ECSTRA-INSERM @ CLEF eHealth2016-task 2: ICD10 Code Extraction from Death Certificates

Authors :
Dermouche, Mohamed
Looten, Vincent
Flicoteaux, Rémi
Chevret, Sylvie
Velcin, Julien
Taright, Namik
Equipe de Recherche en Ingénierie des Connaissances (ERIC)
Université Lumière - Lyon 2 (UL2)
Université Pierre et Marie Curie - Paris 6 (UPMC)
ORS PACA
Biostatistique et épidemiologie clinique
Université Paris Diderot - Paris 7 (UPD7)-Institut National de la Santé et de la Recherche Médicale (INSERM)
Entrepôts, Représentation et Ingénierie des Connaissances (ERIC)
Université Lumière - Lyon 2 (UL2)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Université de Lyon
Service de médecine interne [Saint-Antoine]
Université Pierre et Marie Curie - Paris 6 (UPMC)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-CHU Saint-Antoine [AP-HP]
Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)
CHU Saint-Antoine [AP-HP]
Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)
Source :
Conference and Labs of the Evaluation Forum, Sep 2016, Evora, Portugal
Publication Year :
2016
Publisher :
HAL CCSD, 2016.

Abstract

This paper describes the participation of the ECSTRA-INSERM team in CLEF eHealth 2016, task 2.C. The task involves extracting ICD10 codes from death certificates, which are mostly written as short plain-text entries. We cast the task as a machine learning problem: predicting the ICD10 code (a categorical variable) from the raw text, transformed into a bag-of-words matrix. We rely on probabilistic topic models, which we evaluate against classical classifiers such as SVM and Naive Bayes. We demonstrate the effectiveness of topic models for this task in terms of prediction accuracy and interpretability of the results.
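
As an illustration of the problem framing in the abstract (raw text turned into bag-of-words counts, then a categorical ICD10 code predicted from them), here is a minimal sketch using a multinomial Naive Bayes classifier, one of the baseline families the abstract names. It is not the authors' implementation and it does not cover their topic-model approach. The function names (tokenize, train_naive_bayes, predict), the whitespace tokenizer, the smoothing parameter alpha, and the toy certificate lines with their three-character codes are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Collect bag-of-words counts per class (multinomial Naive Bayes)."""
    class_counts = Counter(labels)        # number of documents per code
    word_counts = defaultdict(Counter)    # per-code word frequencies
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text, alpha=1.0):
    """Return the code with the highest log posterior for `text`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    # Words never seen in training are ignored.
    tokens = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        # Laplace (add-alpha) smoothing over the vocabulary.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for t in tokens:
            score += math.log((word_counts[label][t] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, for illustration only (not from the CLEF corpus).
train_texts = [
    "acute myocardial infarction",
    "myocardial infarction",
    "pneumonia",
    "lobar pneumonia",
    "cardiac arrest",
    "cardiac arrest unspecified",
]
train_codes = ["I21", "I21", "J18", "J18", "I46", "I46"]

model = train_naive_bayes(train_texts, train_codes)
print(predict(model, "old myocardial infarction"))
print(predict(model, "pneumonia"))
```

In this framing each certificate line is a bag of words and each ICD10 code is a class label, so any multi-class text classifier could be swapped in for comparison.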

Details

Language :
English
Database :
OpenAIRE
Journal :
Conference and Labs of the Evaluation Forum, Sep 2016, Evora, Portugal
Accession number :
edsair.dedup.wf.001..084f6333f72199d1eba8564df1b3aac4