Deep Learning of Sequence Patterns for CCCTC-Binding Factor-Mediated Chromatin Loop Formation
- Source :
- Journal of Computational Biology. 28:133-145
- Publication Year :
- 2021
- Publisher :
- Mary Ann Liebert Inc, 2021.
Abstract
- The three-dimensional (3D) organization of the human genome is of crucial importance for gene regulation, and the CCCTC-binding factor (CTCF) plays an important role in chromatin interactions. However, it is still unclear what sequence patterns in addition to CTCF motif pairs determine chromatin loop formation. To discover the underlying sequence patterns, we have developed a deep learning model, called DeepCTCFLoop, to predict whether a chromatin loop can be formed between a pair of convergent or tandem CTCF motifs using only the DNA sequences of the motifs and their flanking regions. Our results suggest that DeepCTCFLoop can accurately distinguish the CTCF motif pairs forming chromatin loops from the ones not forming loops. It significantly outperforms CTCF-MP, a machine learning model based on word2vec and boosted trees, when using DNA sequences only. Furthermore, we show that DNA motifs binding to several transcription factors, including ZNF384, ZNF263, ASCL1, SP1, and ZEB1, may constitute the complex sequence patterns for CTCF-mediated chromatin loop formation. DeepCTCFLoop has also been applied to disease-associated sequence variants to identify candidates that may disrupt chromatin loop formation. Therefore, our results provide useful information for understanding the mechanism of 3D genome organization and may also help annotate and prioritize the noncoding sequence variants associated with human diseases.
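The abstract says the model takes only the DNA sequences of the two CTCF motifs and their flanking regions as input. As a rough illustration of what such a sequence-only input could look like, the sketch below one-hot encodes two sequences and joins them into a single matrix. The one-hot scheme, the concatenation, and the names `one_hot` and `encode_pair` are assumptions made for illustration; the abstract does not describe the actual DeepCTCFLoop encoding or architecture.

```python
# Illustrative sketch only: the encoding choice and all names here are
# assumptions, not details taken from the DeepCTCFLoop paper.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T.

    Any other symbol (for example N) becomes an all-zero row.
    """
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        pos = BASES.find(base)
        if pos >= 0:
            row[pos] = 1
        rows.append(row)
    return rows


def encode_pair(left_seq, right_seq):
    """Join the encodings of two motif-plus-flank sequences, left then right."""
    return one_hot(left_seq) + one_hot(right_seq)


# Toy sequences, far shorter than a real motif with flanking regions.
pair = encode_pair("ACGT", "NNAC")
print(len(pair), pair[0], pair[4])
```

A matrix of this shape is a common input for convolutional sequence models, which is one reason it is used here as the example.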
- Subjects :
- CCCTC-Binding Factor
Biology
Cell Line
03 medical and health sciences
chemistry.chemical_compound
Deep Learning
0302 clinical medicine
Genetics
Humans
Genetic Predisposition to Disease
Nucleotide Motifs
Molecular Biology
Transcription factor
030304 developmental biology
Genomic organization
0303 health sciences
Binding Sites
Computational Biology
DNA
Sequence Analysis, DNA
Chromatin
Computational Mathematics
Computational Theory and Mathematics
chemistry
CTCF
030220 oncology & carcinogenesis
Modeling and Simulation
Human genome
Chromatin Loop
K562 Cells
Sequence motif
HeLa Cells
Transcription Factors
Details
- ISSN :
- 1557-8666
- Volume :
- 28
- Database :
- OpenAIRE
- Journal :
- Journal of Computational Biology
- Accession number :
- edsair.doi.dedup.....d0df3f8c7c5b0436d05b1739ee9afa3e
- Full Text :
- https://doi.org/10.1089/cmb.2020.0225