Evidence of transcription at polyT short tandem repeats
- Publication Year :
- 2021
- Publisher :
- HAL CCSD, 2021.
- Abstract :
- Background: Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of Transcription Start Sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Results: Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at short tandem repeats (STRs) corresponding to homopolymers of thymidines (T). Additional analyses confirm that these CAGE peaks are truly associated with transcriptionally active chromatin marks. Furthermore, we train a sequence-based deep learning model able to predict CAGE signal at T STRs with high accuracy (~81%). Extracting features learned by this model reveals that transcription at T STRs is mostly directed by STR length, but also by instructions lying in the downstream sequence. Excitingly, our model also predicts that genetic variants linked to human diseases affect this STR-associated transcription. Conclusions: Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and add complexity to STR polymorphism. We also provide a new metric that can be considered in future studies of STR-related complex traits.
- Subjects :
- 0303 health sciences
microsatellite
Repertoire
short tandem repeat
deep learning
Promoter
Computational biology
Biology
non-coding transcription
medicine.disease
[SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]
Cap analysis gene expression
humanities
03 medical and health sciences
0302 clinical medicine
Transcription (biology)
[SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN]
medicine
Microsatellite
Enhancer
Gene
030217 neurology & neurosurgery
Transcriptional noise
030304 developmental biology
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....25390ff2bc0d13dbcf9aec51d8dc062f