1. Extracting Information from Molecular Pathway Diagrams
- Author
- Antonio Foncubierta-Rodríguez, Costas Bekas, Anca-Nicoleta Ciubotaru, and Maria Gabrani
- Subjects
0301 basic medicine, Ground truth, Flowchart, Information retrieval, Computer science, media_common.quotation_subject, Systems biology, 020207 software engineering, 02 engineering and technology, Scientific literature, Ambiguity, Automatic summarization, Synthetic data, law.invention, 03 medical and health sciences, 030104 developmental biology, law, 0202 electrical engineering, electronic engineering, information engineering, Precision and recall, media_common
- Abstract
Research fields in the health and life sciences, such as personalized medicine, drug discovery, pharmacovigilance, and systems biology, make intensive use of graphical information to represent knowledge in domain-specific diagrams such as molecular pathways. These diagrams are meant to add value to the written text in scientific literature and related documents. Making all of the existing literature accessible for further research therefore requires making the information contained in these diagrams accessible as well. Molecular pathways differ substantially from more conventional diagrams (e.g. flowcharts), so interpreting them requires domain-specific knowledge to resolve ambiguity. In this paper, we propose a method that automatically extracts information from molecular pathways using computer vision techniques. To the best of our knowledge, this is the first attempt to retrieve information from images depicting molecular pathway diagrams. Because no significant, publicly available dataset with annotated ground truth exists, the experimental evaluation was carried out on synthetic data. Results show high precision and recall values for the detection of entities and relations. We compare the proposed method with prior art on the closest diagram type, using the CLEF-IP flowchart summarization task, and describe the substantial differences between them.
- Published
- 2018