1. Protein-Protein interactions uncover candidate ‘core genes’ within omnigenic disease networks
- Author
- Abhirami Ratnakumar, Nadeem Riaz, Nils Weinhold, and Jessica C. Mar
- Subjects
Proteomics ,Cancer Research ,Gene Identification and Analysis ,Genome-wide association study ,Disease ,Genetic Networks ,QH426-470 ,Biochemistry ,Amyloid beta-Protein Precursor ,0302 clinical medicine ,Risk Factors ,Breast Tumors ,Medicine and Health Sciences ,Insulin ,Protein Interaction Maps ,Genetics (clinical) ,0303 health sciences ,Drug discovery ,BRCA1 Protein ,Genomics ,Oncology ,Protein Interaction Networks ,Female ,Proprotein Convertase 9 ,Network Analysis ,Research Article ,Computer and Information Sciences ,Single-nucleotide polymorphism ,Breast Neoplasms ,Computational biology ,Biology ,Polymorphism, Single Nucleotide ,Protein–protein interaction ,03 medical and health sciences ,Germline mutation ,Interaction network ,Alzheimer Disease ,Breast Cancer ,Genome-Wide Association Studies ,Genetics ,Humans ,Molecular Biology ,Gene ,Mutation Detection ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,Biology and Life Sciences ,Computational Biology ,Cancers and Neoplasms ,Human Genetics ,Genome Analysis ,Diabetes Mellitus, Type 2 ,Genetic Loci ,Mutation ,Somatic Mutation ,030217 neurology & neurosurgery ,Genome-Wide Association Study
- Abstract
Genome-wide association studies (GWAS) of human diseases have generally identified many loci associated with risk, each with a relatively small effect size. The omnigenic model attempts to explain this observation by suggesting that diseases can be thought of as networks, where genes with direct involvement in disease-relevant biological pathways are named ‘core genes’, while peripheral genes influence disease risk via their interactions with, or regulatory effects on, core genes. Here, we demonstrate a method for identifying candidate core genes solely from genes in or near disease-associated SNPs (GWAS hits) in conjunction with protein-protein interaction network data. Applied to 1,381 GWAS studies from 5 ancestries, we identify a total of 1,865 candidate core genes in 343 GWAS studies. Our analysis identifies several well-known disease-related genes that are not identified by GWAS, including BRCA1 in Breast Cancer, Amyloid Precursor Protein (APP) in Alzheimer’s Disease, INS in A1C measurement and Type 2 Diabetes, and PCSK9 in LDL cholesterol, amongst others. Notably, candidate core genes are preferentially enriched for disease relevance over GWAS hits and are enriched for both ClinVar pathogenic variants and known drug targets, consistent with the predictions of the omnigenic model. We subsequently use parent term annotations provided by the GWAS Catalog to merge related GWAS studies and identify candidate core genes in over-arching disease processes such as cancer, where we identify 109 candidate core genes.
Author summary
A recent theory suggests that only a small number of genes underpin the biology of a disease. These genes are called ‘core genes’, and for most diseases they remain unknown. The suggested methods for finding them require complex and expensive experiments. We reasoned that if we merge currently available datasets in smart ways, we may be able to uncover these ‘core genes’.
Our method finds ‘hub’ proteins by merging lists of genes previously linked with disease with information on how proteins interact with each other. We found that many of these hub proteins have central roles in disease, such as insulin for both A1C measurement and Type 2 Diabetes, BRCA1 in Breast Cancer, and Amyloid Precursor Protein in Alzheimer’s Disease. We think these ‘hub’ proteins are candidate ‘core genes’, and offer our method as a way to find ‘core genes’ by utilizing publicly available reference datasets.
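The general idea of combining a GWAS hit list with protein interaction data can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not describe their statistical test, so the code below only counts, for each protein, how many distinct GWAS hit genes it interacts with. The function name `rank_candidate_hubs`, the `min_hit_partners` threshold, and all gene names in the toy data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def rank_candidate_hubs(ppi_edges, gwas_hits, min_hit_partners=2):
    """Rank proteins by the number of distinct GWAS hit genes they interact with.

    Assumption: a simple partner count stands in for whatever significance
    test a real analysis would apply.
    """
    hits = set(gwas_hits)
    hit_partners = defaultdict(set)
    for a, b in ppi_edges:
        if a == b:
            continue  # ignore self-interactions
        if b in hits:
            hit_partners[a].add(b)
        if a in hits:
            hit_partners[b].add(a)
    ranked = [
        (gene, len(partners))
        for gene, partners in hit_partners.items()
        if len(partners) >= min_hit_partners
    ]
    # Highest partner count first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Toy interaction network and hit list (hypothetical gene names).
toy_edges = [
    ("HUB1", "G1"), ("HUB1", "G2"), ("HUB1", "G3"),
    ("G1", "G2"), ("X", "G3"), ("X", "Y"),
]
toy_hits = ["G1", "G2", "G3"]

candidates = rank_candidate_hubs(toy_edges, toy_hits)
print(candidates)
```

In this toy network, `HUB1` is not itself a GWAS hit but interacts with all three hit genes, which is the kind of protein the abstract describes as a candidate core gene. A real analysis would also need to correct for the fact that highly connected proteins accumulate hit partners by chance.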
- Published
- 2020