1. G-quadruplex structural variations in human genome associated with single-nucleotide variations and their impact on gene activity
- Author
Juan-Nan Chen, Cui-Jiao Wen, Ming-Liang Tang, Jia-Yuan Gong, Zheng Tan, Rui-Fang Duan, Yu-hua Hao, Yi-de He, Jia-yu Zhang, Ke-wei Zheng, Su-Ping Ren, and Qun Yu
- Subjects
Transcriptional Activation; 0301 basic medicine; dbSNP; Regulatory Sequences, Nucleic Acid; Biology; 010402 general chemistry; G-quadruplex; Polymorphism, Single Nucleotide; 01 natural sciences; 03 medical and health sciences; Transcription (biology); Genetic variation; Humans; Promoter Regions, Genetic; Enhancer; Gene; Genetics; Multidisciplinary; Genome, Human; Biological Sciences; 0104 chemical sciences; DNA-Binding Proteins; G-Quadruplexes; 030104 developmental biology; Gene Expression Regulation; Nucleic acid; Nucleic Acid Conformation; Human genome; Transcription Initiation Site; Protein Binding
- Abstract
G-quadruplexes (G4s) formed by guanine-rich nucleic acids play a role in essential biological processes such as transcription and replication. Besides the >1.5 million putative G4-forming sequences (PQSs), the human genome features >640 million single-nucleotide variations (SNVs), the most common type of genetic variation among people or populations. An SNV may alter a G4 structure when it falls within a PQS motif. To date, genome-wide PQS–SNV interactions and their impact have not been investigated. Herein, we present a study of the PQS–SNV interactions and the impact they can have on G4 structures and, subsequently, on gene expression. Based on build 154 of the Single Nucleotide Polymorphism Database (dbSNP), we identified 5 million gains/losses or structural conversions of G4s that can be caused by the SNVs. Of these G4 variations (G4Vs), 3.4 million are within genes, resulting in an average load of >120 G4Vs per gene, preferentially enriched near the transcription start site. Moreover, >80% of the G4Vs overlap with transcription factor–binding sites and >14% with enhancers, giving an average load of 3 and 7.5 for the two regulatory elements, respectively. Our experiments show that such G4Vs can significantly influence the expression of their host genes. These results reveal genome-wide G4Vs and their impact on gene activity, underscoring the value of understanding genetic variation, and its physiological function and pathological implications, from a structural perspective. The G4Vs may also provide a unique category of drug targets for individualized therapeutics, health risk assessment, and drug development.
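The abstract's central idea, that an SNV falling inside a PQS motif can create or destroy it, can be illustrated with a minimal Python sketch. The motif used here (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used PQS definition and is an assumption; the abstract does not state which motif or tooling the authors used. The function names `has_pqs`, `apply_snv`, and `classify_snv` are hypothetical, and the sketch scans only the given strand and cannot detect structural conversions between two G4 forms.

```python
import re

# Assumed PQS motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (a widely used
# definition, not necessarily the one applied in the study).
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def has_pqs(seq: str) -> bool:
    """Return True if the given strand contains at least one PQS match."""
    return PQS_PATTERN.search(seq.upper()) is not None


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based index pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def classify_snv(seq: str, pos: int, alt: str) -> str:
    """Classify an SNV as 'gain', 'loss', or 'unchanged' by PQS presence.

    Only presence or absence of a motif match is compared, so a
    conversion from one G4 form to another is reported as 'unchanged'.
    """
    before = has_pqs(seq)
    after = has_pqs(apply_snv(seq, pos, alt))
    if before and not after:
        return "loss"
    if after and not before:
        return "gain"
    return "unchanged"


if __name__ == "__main__":
    reference = "TTGGGAGGGTGGGAGGGTT"  # toy sequence, not real genomic data
    print(classify_snv(reference, 7, "A"))  # G>A inside the second G-run
    print(classify_snv(reference, 0, "C"))  # T>C outside the motif
```

A genome-wide analysis would apply the same comparison to a window around each dbSNP position on both strands; that step is omitted here.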
- Published
- 2021