Survey of BERT-Base Models for Scientific Text Classification: COVID-19 Case Study

Authors :
Mayara Khadhraoui
Hatem Bellaaj
Mehdi Ben Ammar
Habib Hamam
Mohamed Jmaiel
Source :
Applied Sciences, Vol 12, Iss 6, p 2891 (2022)
Publication Year :
2022
Publisher :
MDPI AG, 2022.

Abstract

On 30 January 2020, the World Health Organization announced a new coronavirus, which later turned out to be very dangerous. COVID-19 has since spread to become a pandemic affecting practically all regions of the world, and many medical researchers have contributed to fighting it. In this context, and given the rapid growth of scientific publications related to the pandemic, manual text and data retrieval has become a challenging task. To address this challenge, we propose CovBERT, a pre-trained language model based on BERT that automates the literature review process. CovBERT relies on prior training on a large corpus of scientific publications in the biomedical domain and related to COVID-19 to increase its performance on the literature review task. We evaluate CovBERT on short-text classification using COV-Dat-20, our scientific dataset of biomedical articles on COVID-19. We demonstrate statistically significant improvements by using BERT.

Details

Language :
English
ISSN :
2076-3417
Volume :
12
Issue :
6
Database :
Directory of Open Access Journals
Journal :
Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.1745836d9d094e6aaaf2688a3ff07f8c
Document Type :
article
Full Text :
https://doi.org/10.3390/app12062891