Transposable element sequence fragments incorporated into coding and noncoding transcripts modulate the transcriptome of human pluripotent stem cells

Authors :
Jon Frampton
Yuhao Li
Qiang Zhuang
Jiangping He
Boping Deng
Miguel A. Esteban
Andrew P. Hutchins
Mazid Md. Abdul
Jean-Baptiste Cazier
Gang Ma
Minchun Chen
Isaac A. Babarinde
Xiuling Fu
Wenjuan Li
Zhiwei Luo
Liyang Shi
Jiekai Chen
Martha Duttlinger
Li Sun
Hao Liu
Guoqing Tong
Ralf Jauch
Carl Ward
Source :
Nucleic Acids Research
Publication Year :
2021
Publisher :
Oxford University Press (OUP), 2021.

Abstract

Transposable elements (TEs) occupy nearly 40% of mammalian genomes and, whilst most are fragmentary and no longer capable of transposition, they can nevertheless contribute to cell function. TEs within genes transcribed by RNA polymerase II can be copied as parts of primary transcripts; however, their full contribution to mature transcript sequences remains unresolved. Here, using long- and short-read (LR and SR) RNA sequencing data, we show that 26% of coding and 65% of noncoding transcripts in human pluripotent stem cells (hPSCs) contain TE-derived sequences. Different TE families are incorporated into RNAs in unique patterns, with consequences for transcript structure and function. The presence of TE sequences within a transcript is correlated with TE-type-specific changes in its subcellular distribution, alterations in steady-state levels and half-life, and differential association with RNA-binding proteins (RBPs). We identify hPSC-specific incorporation of endogenous retroviruses (ERVs) and LINE:L1 into protein-coding mRNAs, which generate TE sequence-derived peptides. Finally, single-cell RNA-seq reveals that hPSCs express ERV-containing transcripts, whilst differentiating subpopulations lack ERVs and express SINE- and LINE-containing transcripts. Overall, our comprehensive analysis demonstrates that the incorporation of TE sequences into the RNAs of hPSCs is more widespread and has a greater impact than previously appreciated.

Details

ISSN :
1362-4962 and 0305-1048
Volume :
49
Database :
OpenAIRE
Journal :
Nucleic Acids Research
Accession number :
edsair.doi.dedup.....f05f75530169b12b67f996893c16dc45
Full Text :
https://doi.org/10.1093/nar/gkab710