Identifying and classifying goals for scientific knowledge.

Authors :
Boguslav MR
Salem NM
White EK
Leach SM
Hunter LE
Source :
Bioinformatics advances [Bioinform Adv] 2021 Jul 28; Vol. 1 (1), pp. vbab012. Date of Electronic Publication: 2021 Jul 28 (Print Publication: 2021).
Publication Year :
2021

Abstract

Motivation: Science progresses by posing good questions, yet work in biomedical text mining has not focused on them much. We propose a novel idea for biomedical natural language processing: identifying and characterizing the questions stated in the biomedical literature. Formally, the task is to identify and characterize statements of ignorance, statements where scientific knowledge is missing or incomplete. The creation of such technology could have many significant impacts, from the training of PhD students to ranking publications and prioritizing funding based on particular questions of interest. The work presented here is intended as the first step towards these goals.

Results: We present a novel ignorance taxonomy driven by the role statements of ignorance play in research, identifying specific goals for future scientific knowledge. Using this taxonomy and reliable annotation guidelines (inter-annotator agreement above 80%), we created a gold standard ignorance corpus of 60 full-text documents from the prenatal nutrition literature with over 10 000 annotations and used it to train classifiers that achieved F1 scores over 0.80.

Availability and Implementation: Corpus and source code freely available for download at https://github.com/UCDenver-ccp/Ignorance-Question-Work. The source code is implemented in Python.

(© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)

Details

Language :
English
ISSN :
2635-0041
Volume :
1
Issue :
1
Database :
MEDLINE
Journal :
Bioinformatics advances
Publication Type :
Academic Journal
Accession number :
34661112
Full Text :
https://doi.org/10.1093/bioadv/vbab012