
BERT-TFBS: a novel BERT-based model for predicting transcription factor binding sites by transfer learning.

Authors :
Wang, Kai
Zeng, Xuan
Zhou, Jingwen
Liu, Fei
Luan, Xiaoli
Wang, Xinglong
Source :
Briefings in Bioinformatics; May 2024, Vol. 25 Issue 3, p1-11, 11p
Publication Year :
2024

Abstract

Transcription factors (TFs) are proteins essential for regulating genetic transcriptions by binding to transcription factor binding sites (TFBSs) in DNA sequences. Accurate predictions of TFBSs can contribute to the design and construction of metabolic regulatory systems based on TFs. Although various deep-learning algorithms have been developed for predicting TFBSs, the prediction performance needs to be improved. This paper proposes a bidirectional encoder representations from transformers (BERT)-based model, called BERT-TFBS, to predict TFBSs solely based on DNA sequences. The model consists of a pre-trained BERT module (DNABERT-2), a convolutional neural network (CNN) module, a convolutional block attention module (CBAM) and an output module. The BERT-TFBS model utilizes the pre-trained DNABERT-2 module to acquire the complex long-term dependencies in DNA sequences through a transfer learning approach, and applies the CNN module and the CBAM to extract high-order local features. The proposed model is trained and tested based on 165 ENCODE ChIP-seq datasets. We conducted experiments with model variants, cross-cell-line validations and comparisons with other models. The experimental results demonstrate the effectiveness and generalization capability of BERT-TFBS in predicting TFBSs, and they show that the proposed model outperforms other deep-learning models. The source code for BERT-TFBS is available at https://github.com/ZX1998-12/BERT-TFBS. [ABSTRACT FROM AUTHOR]
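Illustrative note (not part of the authors' abstract): the abstract names a convolutional block attention module (CBAM) as the component that refines the features produced by the CNN module. The sketch below shows the general CBAM idea, channel attention followed by spatial attention, on a tiny feature map. It is a minimal sketch under stated assumptions and not the BERT-TFBS implementation: the 1D layout (channels x positions), the function names, the single-hidden-layer shared MLP without biases, and the hand-supplied weights are all hypothetical choices made for readability. The authors' actual code is in the repository linked in the abstract.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w1, w2):
    """Return one weight in (0, 1) per channel.

    feat: list of C channels, each a list of L floats (assumed 1D layout).
    w1:   hidden x C weights, w2: C x hidden weights of a shared MLP
          (hypothetical, bias-free, for illustration only).
    """
    avg = [sum(ch) / len(ch) for ch in feat]   # average pooling over positions
    mx = [max(ch) for ch in feat]              # max pooling over positions

    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    # The same MLP scores both pooled vectors; the scores are summed.
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(feat, kernel):
    """Return one weight in (0, 1) per position.

    kernel: list of K (w_avg, w_max) pairs with K odd; applied as a
            zero-padded 1D convolution over the two pooled maps.
    """
    n_ch, length = len(feat), len(feat[0])
    avg = [sum(feat[c][i] for c in range(n_ch)) / n_ch for i in range(length)]
    mx = [max(feat[c][i] for c in range(n_ch)) for i in range(length)]
    half = len(kernel) // 2
    weights = []
    for i in range(length):
        s = 0.0
        for k, (w_avg, w_max) in enumerate(kernel):
            j = i + k - half
            if 0 <= j < length:                # zero padding at the borders
                s += w_avg * avg[j] + w_max * mx[j]
        weights.append(sigmoid(s))
    return weights


def cbam(feat, w1, w2, kernel):
    """Rescale features by channel attention, then by spatial attention."""
    ca = channel_attention(feat, w1, w2)
    feat = [[ca[c] * x for x in ch] for c, ch in enumerate(feat)]
    sa = spatial_attention(feat, kernel)
    return [[sa[i] * x for i, x in enumerate(ch)] for ch in feat]
```

Calling `cbam(feat, w1, w2, kernel)` returns a feature map of the same shape as `feat`, with every value scaled by two factors between 0 and 1. In a trained model the weights would be learned, and attention over 2D feature maps would replace the 1D convolution with a 2D one.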

Details

Language :
English
ISSN :
1467-5463
Volume :
25
Issue :
3
Database :
Complementary Index
Journal :
Briefings in Bioinformatics
Publication Type :
Academic Journal
Accession number :
177375820
Full Text :
https://doi.org/10.1093/bib/bbae195