
Text mining for improved exposure assessment.

Authors :
Larsson, Kristin
Baker, Simon
Silins, Ilona
Guo, Yufan
Stenius, Ulla
Korhonen, Anna
Berglund, Marika
Source :
PLoS ONE; 3/3/2017, Vol. 12 Issue 3, p1-21, 21p
Publication Year :
2017

Abstract

Chemical exposure assessments are based on information collected via different methods, such as biomonitoring, personal monitoring, environmental monitoring and questionnaires. The vast amount of chemical-specific exposure information available from web-based databases, such as PubMed, is undoubtedly a great asset to the scientific community. However, manual retrieval of relevant published information is an extremely time-consuming task, and overviewing the data is nearly impossible. Here, we present the development of an automatic classifier for chemical exposure information. First, nearly 3700 abstracts were manually annotated by an expert in exposure sciences according to a taxonomy exclusively created for exposure information. Natural Language Processing (NLP) techniques were used to extract semantic and syntactic features relevant to chemical exposure text. Using these features, we trained a supervised machine learning algorithm to automatically classify PubMed abstracts according to the exposure taxonomy. The resulting classifier demonstrates good performance in the intrinsic evaluation. We also show that the classifier improves information retrieval of chemical exposure data compared to keyword-based PubMed searches. Case studies demonstrate that the classifier can be used to assist researchers by facilitating information retrieval and classification, enabling data gap recognition and overviewing available scientific literature using chemical-specific publication profiles. Finally, we identify challenges to be addressed in future development of the system. [ABSTRACT FROM AUTHOR]
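
Illustrative sketch (not from the article): the abstract describes training a supervised classifier that assigns PubMed abstracts to nodes of an exposure taxonomy, but it does not name the features or the learning algorithm. The Python example below is therefore only a toy stand-in, assuming plain bag-of-words features and one multinomial Naive Bayes model per taxonomy node. The node names, the training sentences, and all function and class names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class BinaryNaiveBayes:
    """Multinomial Naive Bayes for one yes/no label, with add-one smoothing.

    Assumption: a stand-in for the unspecified supervised learner.
    """

    def fit(self, docs, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.n_docs = {True: 0, False: 0}
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
            self.n_docs[label] += 1
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, tokens, label):
        total = sum(self.counts[label].values())
        n_all = self.n_docs[True] + self.n_docs[False]
        # Smoothed log prior of the class.
        score = math.log((self.n_docs[label] + 1) / (n_all + 2))
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are ignored
                score += math.log(
                    (self.counts[label][tok] + 1) / (total + len(self.vocab))
                )
        return score

    def predict(self, doc):
        tokens = tokenize(doc)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


def train_multilabel(docs, label_sets, taxonomy):
    """Train one binary model per taxonomy node (one-vs-rest)."""
    return {
        node: BinaryNaiveBayes().fit(docs, [node in s for s in label_sets])
        for node in taxonomy
    }


def classify(models, doc):
    """Return the set of taxonomy nodes whose model accepts the document."""
    return {node for node, model in models.items() if model.predict(doc)}


# Hypothetical two-node taxonomy and hand-made training sentences.
taxonomy = ["biomonitoring", "environmental_monitoring"]
train_docs = [
    "Urinary cadmium concentrations were measured in blood and urine samples from adults.",
    "Lead levels in blood samples from children were analysed.",
    "Particulate matter was measured in outdoor air near roads.",
    "Pesticide residues were detected in drinking water and soil.",
]
train_labels = [
    {"biomonitoring"},
    {"biomonitoring"},
    {"environmental_monitoring"},
    {"environmental_monitoring"},
]

models = train_multilabel(train_docs, train_labels, taxonomy)
print(classify(models, "Mercury in blood samples of pregnant women"))
print(classify(models, "Arsenic detected in drinking water"))
```

On this toy data the first query is assigned to the biomonitoring node and the second to the environmental monitoring node. A real system would need the roughly 3700 annotated abstracts and the richer semantic and syntactic features that the abstract mentions.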

Details

Language :
English
ISSN :
1932-6203
Volume :
12
Issue :
3
Database :
Complementary Index
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
121549441
Full Text :
https://doi.org/10.1371/journal.pone.0173132