16 results on '"Segun Jung"'
Search Results
2. A novel MERTK mutation causing retinitis pigmentosa
- Author
-
Segun Jung, Kaanan P. Shah, Michael A. Grassi, Ravi Madduri, Hasenin Al-khersan, and Alex Rodriguez
- Subjects
Male ,0301 basic medicine ,Proband ,DNA Mutational Analysis ,Nonsense mutation ,Biology ,Retina ,Article ,03 medical and health sciences ,Cellular and Molecular Neuroscience ,0302 clinical medicine ,Locus heterogeneity ,medicine ,Humans ,Exome ,Exome sequencing ,Genetic testing ,Genetics ,c-Mer Tyrosine Kinase ,medicine.diagnostic_test ,Genetic heterogeneity ,DNA ,MERTK ,medicine.disease ,Sensory Systems ,Pedigree ,Ophthalmoscopy ,Ophthalmology ,030104 developmental biology ,Mutation ,030221 ophthalmology & optometry ,Female ,Allelic heterogeneity ,Retinitis Pigmentosa
- Abstract
Retinitis pigmentosa (RP) is a genetically heterogeneous inherited retinal dystrophy. To date, over 80 genes have been implicated in RP. However, the disease demonstrates significant locus and allelic heterogeneity not entirely captured by current testing platforms. The purpose of the present study was to characterize the underlying mutation in a patient with RP without a molecular diagnosis after initial genetic testing. Whole-exome sequencing of the affected proband was performed. Candidate gene mutations were selected based on adherence to expected genetic inheritance pattern and predicted pathogenicity. Sanger sequencing of MERTK was completed on the patient’s unaffected mother, affected brother, and unaffected sister to determine genetic phase. Eight sequence variants were identified in the proband in known RP-associated genes. Sequence analysis revealed that the proband was a compound heterozygote with two independent mutations in MERTK, a novel nonsense mutation (c.2179C > T) and a previously reported missense variant (c.2530C > T). The proband’s affected brother also had both mutations. Predicted phase was confirmed in unaffected family members. Our study identifies a novel nonsense mutation in MERTK in a family with RP and no prior molecular diagnosis. The present study also demonstrates the clinical value of exome sequencing in determining the genetic basis of Mendelian diseases when standard genetic testing is unsuccessful.
- Published
- 2017
3. Genetic deletion of Sphk2 confers protection against Pseudomonas aeruginosa mediated differential expression of genes related to virulent infection and inflammation in mouse lung
- Author
-
David L. Ebenezer, Zarema Arbieva, Ravi Madduri, Mark Maienschein-Cline, Yashaswin Krishnan, Hong Hu, Viswanathan Natarajan, Anantha Harijith, Panfeng Fu, and Segun Jung
- Subjects
lcsh:QH426-470 ,lcsh:Biotechnology ,Inflammation ,Biology ,medicine.disease_cause ,Microbiology ,Transcriptome ,03 medical and health sciences ,Mice ,0302 clinical medicine ,lcsh:TP248.13-248.65 ,Gene expression ,Genomics, bacterial resistance ,Genetics ,medicine ,Sphingosine kinase 2 ,Animals ,Gene Regulatory Networks ,Pseudomonas Infections ,RNA-Seq ,Lung ,030304 developmental biology ,Regulation of gene expression ,0303 health sciences ,Sphingolipids ,Analysis of Variance ,Virulence ,Pseudomonas aeruginosa ,Gene Expression Profiling ,Wild type ,High-Throughput Nucleotide Sequencing ,Pneumonia ,3. Good health ,lcsh:Genetics ,SPHK2 ,Disease Models, Animal ,Phosphotransferases (Alcohol Group Acceptor) ,Gene Expression Regulation ,030220 oncology & carcinogenesis ,Female ,Signal transduction ,medicine.symptom ,Gene Deletion ,Biotechnology ,Research Article
- Abstract
Background: Pseudomonas aeruginosa (PA) is an opportunistic Gram-negative bacterium that causes serious life threatening and nosocomial infections including pneumonia. PA has the ability to alter host genome to facilitate its invasion, thus increasing the virulence of the organism. Sphingosine-1-phosphate (S1P), a bioactive lipid, is known to play a key role in facilitating infection. Sphingosine kinases (SPHK) 1&2 phosphorylate sphingosine to generate S1P in mammalian cells. We reported earlier that Sphk2−/− mice offered significant protection against lung inflammation, compared to wild type (WT) animals. Therefore, we profiled the differential expression of genes between the protected group of Sphk2−/− and the wild type controls to better understand the underlying protective mechanisms related to the Sphk2 deletion in lung inflammatory injury. Whole transcriptome shotgun sequencing (RNA-Seq) was performed on mouse lung tissue using NextSeq 500 sequencing system.
Results: Two-way analysis of variance (ANOVA) analysis was performed and differentially expressed genes following PA infection were identified using whole transcriptome of Sphk2−/− mice and their WT counterparts. Pathway (PW) enrichment analyses of the RNA seq data identified several signaling pathways that are likely to play a crucial role in pneumonia caused by PA such as those involved in: 1. Immune response to PA infection and NF-κB signal transduction; 2. PKC signal transduction; 3. Impact on epigenetic regulation; 4. Epithelial sodium channel pathway; 5. Mucin expression; and 6. Bacterial infection related pathways. Our genomic data suggests a potential role for SPHK2 in PA-induced pneumonia through elevated expression of inflammatory genes in lung tissue. Further, validation by RT-PCR on 10 differentially expressed genes showed 100% concordance in terms of vectoral changes as well as significant fold change.
Conclusion: Using Sphk2−/− mice and differential gene expression analysis, we have shown here that S1P/SPHK2 signaling could play a key role in promoting PA pneumonia. The identified genes promote inflammation and suppress others that naturally inhibit inflammation and host defense. Thus, targeting SPHK2/S1P signaling in PA-induced lung inflammation could serve as a potential therapy to combat PA-induced pneumonia.
- Published
- 2019
4. Identification and validation of regulatory SNPs that modulate transcription factor chromatin binding and gene expression in prostate cancer
- Author
-
Segun Jung, Ramana V. Davuluri, Hongjian Jin, and Auditi R. DebRoy
- Subjects
Male ,0301 basic medicine ,SNP ,Single-nucleotide polymorphism ,Genome-wide association study ,Kaplan-Meier Estimate ,Biology ,eQTL ,Polymorphism, Single Nucleotide ,03 medical and health sciences ,0302 clinical medicine ,Cell Line, Tumor ,Humans ,Genetic Predisposition to Disease ,Enhancer ,CRISPR/Cas9 ,Gene ,Alleles ,transcription factor ,Genetics ,Base Sequence ,Chromatin binding ,Prostatic Neoplasms ,prostate cancer ,Chromatin ,3. Good health ,Gene Expression Regulation, Neoplastic ,030104 developmental biology ,Oncology ,030220 oncology & carcinogenesis ,Expression quantitative trait loci ,Functional genomics ,Chromatin immunoprecipitation ,Genome-Wide Association Study ,Protein Binding ,Transcription Factors ,Research Paper
- Abstract
Hong-Jian Jin, Segun Jung, Auditi R. DebRoy, Ramana V. Davuluri
Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
Correspondence to: Ramana V. Davuluri, email: ramana.davuluri@northwestern.edu
Keywords: SNP, prostate cancer, transcription factor, CRISPR/Cas9, eQTL
Received: March 23, 2016. Accepted: May 23, 2016. Published: July 09, 2016.
Prostate cancer (PCa) is the second most common solid tumor for cancer related deaths in American men. Genome wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with the increased risk of PCa. Because most of the susceptibility SNPs are located in noncoding regions, little is known about their functional mechanisms. We hypothesize that functional SNPs reside in cell type-specific regulatory elements that mediate the binding of critical transcription factors (TFs), which in turn result in changes in target gene expression. Using PCa-specific functional genomics data, here we identify 38 regulatory candidate SNPs and their target genes in PCa. Through risk analysis by incorporating gene expression and clinical data, we identify 6 target genes (ZG16B, ANKRD5, RERE, FAM96B, NAALADL2 and GTPBP10) as significant predictors of PCa biochemical recurrence. In addition, 5 SNPs (rs2659051, rs10936845, rs9925556, rs6057110 and rs2742624) are selected for experimental validation using chromatin immunoprecipitation (ChIP) and a dual-luciferase reporter assay in LNCaP cells, showing allele-specific enhancer activity. Furthermore, we delete the rs2742624-containing region using CRISPR/Cas9 genome editing and observe the drastic downregulation of its target gene UPK3A. Taken together, our results illustrate that this new methodology can be applied to identify regulatory SNPs and their target genes that likely impact PCa risk. We suggest that similar studies can be performed to characterize regulatory variants in other diseases.
- Published
- 2016
5. Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types
- Author
-
Cory C. Funk, Nilufer Ertekin-Taner, Leroy Hood, Matthew A. Richards, Nathan D. Price, Alex Rodriguez, Gustavo Glusman, Yukai Xiao, Alex M. Casella, Segun Jung, Ben Heavner, Ian Foster, Kyle Chard, Paul Shannon, Rory Donovan-Maiye, Carl Kesselman, John D. Van Horn, Todd E. Golde, Arthur W. Toga, Ravi Madduri, and Seth A. Ament
- Subjects
0301 basic medicine ,genetic processes ,information science ,Gene regulatory network ,Computational biology ,Biology ,ENCODE ,General Biochemistry, Genetics and Molecular Biology ,DNase-Seq ,Article ,03 medical and health sciences ,0302 clinical medicine ,Genetic variation ,Humans ,natural sciences ,Transcription factor ,Binding Sites ,Deoxyribonucleases ,Genomics ,Footprinting ,DNA binding site ,030104 developmental biology ,health occupations ,Human genome ,Sequence motif ,Hypersensitive site ,030217 neurology & neurosurgery ,Transcription Factors
- Abstract
There is intense interest in mapping the tissue-specific binding sites of transcription factors in the human genome to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting provides a means to predict genome-wide binding sites for hundreds of transcription factors (TFs) simultaneously. However, despite the public availability of DNase-seq data for hundreds of samples, there is neither a unified analytical workflow nor a publicly accessible database providing the locations of footprints across all available samples. Here, we implemented a workflow for uniform processing of footprints using two state-of-the-art footprinting algorithms: Wellington and HINT. Our workflow scans the footprints generated by these algorithms for 1,530 sequence motifs to predict binding sites for 1,515 human transcription factors. We applied our workflow to detect footprints in 192 DNase-seq experiments from ENCODE spanning 27 human tissues. This collection of footprints describes an expansive landscape of potential TF occupancy. At thresholds optimized through machine learning, we report high-quality footprints covering 9.8% of the human genome. These footprints were enriched for true positive TF binding sites as defined by ChIP-seq peaks, as well as for genetic variants associated with changes in gene expression. Integrating our footprint atlas with summary statistics from genome-wide association studies revealed that risk for neuropsychiatric traits was enriched specifically at highly-scoring footprints in human brain, while risk for immune traits was enriched specifically at highly-scoring footprints in human lymphoblasts. Our cloud-based workflow is available at github.com/globusgenomics/genomics-footprint and a database with all footprints and TF binding site predictions are publicly available at http://data.nemoarchive.org/other/grant/sament/sament/footprint_atlas.
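The step of relating sequence-motif hits to footprints can be illustrated with a minimal sketch. It is not the published workflow: the function name `motifs_in_footprints`, the half-open coordinate convention, and the rule that a motif hit must lie entirely inside a footprint are all assumptions made for illustration.

```python
def motifs_in_footprints(footprints, motif_hits):
    """Return motif hits that lie fully inside a footprint (hypothetical rule).

    footprints: list of (start, end) half-open intervals on one chromosome.
    motif_hits: list of (start, end, motif_name) half-open intervals.
    """
    footprints = sorted(footprints)
    supported = []
    for m_start, m_end, name in sorted(motif_hits):
        for f_start, f_end in footprints:
            # Footprints are sorted by start, so once one begins after the
            # motif start, no later footprint can contain the motif.
            if f_start > m_start:
                break
            if m_end <= f_end:
                supported.append((m_start, m_end, name, (f_start, f_end)))
                break
    return supported


if __name__ == "__main__":
    fps = [(100, 130), (200, 220)]
    hits = [(105, 115, "TF_A"), (125, 135, "TF_B"), (201, 210, "TF_C")]
    for row in motifs_in_footprints(fps, hits):
        print(row)
```

The nested loop is quadratic in the worst case; a genome-scale analysis would more likely rely on an interval index or a dedicated interval-intersection tool.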
- Published
- 2018
6. O3‐03‐01: MECHANISTIC AND DIRECTIONAL TRANSCRIPTIONAL REGULATORY NETWORKS IN ALZHEIMER'S DISEASE
- Author
-
Matthew A. Richards, Karen N. McFarland, Segun Jung, Nathan D. Price, Alex Rodriguez, Todd E. Golde, Paul Shannon, Paramita Chakrabarty, Mariet Allen, Ravi Madduri, Minerva M. Carrasquillo, Ian Foster, Nilufer Ertekin-Taner, Cory C. Funk, Leroy Hood, Rory Donovan-Maiye, Seth A. Ament, Max Robinson, and Noa Rappaport
- Subjects
Psychiatry and Mental health ,Cellular and Molecular Neuroscience ,Developmental Neuroscience ,Epidemiology ,Health Policy ,Neurology (clinical) ,Computational biology ,Disease ,Geriatrics and Gerontology ,Biology
- Published
- 2018
7. Expression profiling of genes regulated by Sphingosine kinase 2 in a murine model of Pseudomonas aeruginosa mediated acute lung inflammation
- Author
-
Anantha Harijith, Panfeng Fu, Yashaswin Krishnan, Ravi Madduri, Hong Hu, David L. Ebenezer, Segun Jung, Zarema Arbieva, and Viswanathan Natarajan
- Subjects
Lung ,Pseudomonas aeruginosa ,Sphingosine Kinase 2 ,Inflammation ,Biology ,medicine.disease_cause ,Biochemistry ,Gene expression profiling ,medicine.anatomical_structure ,Murine model ,Genetics ,medicine ,Cancer research ,medicine.symptom ,Molecular Biology ,Gene ,Biotechnology
- Published
- 2018
8. Interconversion between Parallel and Antiparallel Conformations of a 4H RNA junction in Domain 3 of Foot-and-Mouth Disease Virus IRES Captured by Dynamics Simulations
- Author
-
Tamar Schlick and Segun Jung
- Subjects
Principal Component Analysis ,Base Sequence ,Biophysics ,Stacking ,RNA ,Molecular Dynamics Simulation ,Biology ,Antiparallel (biochemistry) ,Internal ribosome entry site ,Crystallography ,Molecular dynamics ,Förster resonance energy transfer ,Foot-and-Mouth Disease Virus ,Nucleic Acid Conformation ,RNA, Viral ,Proteins and Nucleic Acids ,Peptide Chain Initiation, Translational ,Conformational isomerism ,Protein secondary structure
- Abstract
RNA junctions are common secondary structural elements present in a wide range of RNA species. They play crucial roles in directing the overall folding of RNA molecules as well as in a variety of biological functions. In particular, there has been great interest in the dynamics of RNA junctions, including conformational pathways of fully base-paired 4-way (4H) RNA junctions. In such constructs, all nucleotides participate in one of the four double-stranded stem regions, with no connecting loops. Dynamical aspects of these 4H RNAs are interesting because frequent interchanges between parallel and antiparallel conformations are thought to occur without binding of other factors. Gel electrophoresis and single-molecule fluorescence resonance energy transfer experiments have suggested two possible pathways: one involves a helical rearrangement via disruption of coaxial stacking, and the other occurs by a rotation between the helical axes of coaxially stacked conformers. Employing molecular dynamics simulations, we explore this conformational variability in a 4H junction derived from domain 3 of the foot-and-mouth disease virus internal ribosome entry site (IRES); this junction contains highly conserved motifs for RNA-RNA and RNA-protein interactions, important for IRES activity. Our simulations capture transitions of the 4H junction between parallel and antiparallel conformations. The interconversion is virtually barrier-free and occurs via a rotation between the axes of coaxially stacked helices with a transient perpendicular intermediate. We characterize this transition, with various interhelical orientations, by pseudodihedral angle and interhelical distance measures. The high flexibility of the junction, as also demonstrated experimentally, is suitable for IRES activity. 
Because foot-and-mouth disease virus IRES structure depends on long-range interactions involving domain 3, the perpendicular intermediate, which maintains coaxial stacking of helices and thereby consensus primary and secondary structure information, may be beneficial for guiding the overall organization of the RNA system in domain 3.
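The pseudodihedral measure mentioned in the abstract can be illustrated with a generic dihedral-angle calculation over four points. This is only a sketch: which four points define the interhelical pseudodihedral (for example, points along two helical axes) is not stated in this record, so the inputs below are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle, in degrees, defined by four 3D points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1n = _scale(b1, 1.0 / norm)
    # Project the two outer vectors onto the plane perpendicular to the
    # central axis, then measure the angle between the projections.
    v = _sub(b0, _scale(b1n, _dot(b0, b1n)))
    w = _sub(b2, _scale(b1n, _dot(b2, b1n)))
    x = _dot(v, w)
    y = _dot(_cross(b1n, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points: the outer arms lie on the same side of the axis.
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

In this geometric picture, arms pointing the same way give an angle near 0 degrees, opposite arms give a magnitude near 180 degrees, and a magnitude near 90 degrees corresponds to a perpendicular arrangement.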
- Published
- 2014
9. Identification of Genetic and Epigenetic Variants Associated with Breast Cancer Prognosis by Integrative Bioinformatics Analysis
- Author
-
Ramana V. Davuluri, Arunima Shilpi, Yingtao Bi, Segun Jung, and Samir Kumar Patra
- Subjects
0301 basic medicine ,Cancer Research ,overall survival ,Single-nucleotide polymorphism ,Genomics ,Computational biology ,Quantitative trait locus ,Biology ,Bioinformatics ,lcsh:RC254-282 ,03 medical and health sciences ,0302 clinical medicine ,Breast cancer ,medicine ,Epigenetics ,Original Research ,Epigenomics ,DNA methylation ,meQTLs ,single-nucleotide polymorphism ,medicine.disease ,lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens ,3. Good health ,030104 developmental biology ,Oncology ,030220 oncology & carcinogenesis ,SNP array
- Abstract
Introduction: Breast cancer being a multifaceted disease constitutes a wide spectrum of histological and molecular variability in tumors. However, the task for the identification of these variances is complicated by the interplay between inherited genetic and epigenetic aberrations. Therefore, this study provides an extrapolate outlook to the sinister partnership between DNA methylation and single-nucleotide polymorphisms (SNPs) in relevance to the identification of prognostic markers in breast cancer. The effect of these SNPs on methylation is defined as methylation quantitative trait loci (meQTL).
Materials and Methods: We developed a novel method to identify prognostic gene signatures for breast cancer by integrating genomic and epigenomic data. This is based on the hypothesis that multiple sources of evidence pointing to the same gene or pathway are likely to lead to reduced false positives. We also apply random resampling to reduce overfitting noise by dividing samples into training and testing data sets. Specifically, the common samples between Illumina 450 DNA methylation, Affymetrix SNP array, and clinical data sets obtained from the Cancer Genome Atlas (TCGA) for breast invasive carcinoma (BRCA) were randomly divided into training and test models. An intensive statistical analysis based on log-rank test and Cox proportional hazard model has established a significant association between differential methylation and the stratification of breast cancer patients into high- and low-risk groups, respectively.
Results: The comprehensive assessment based on the conjoint effect of CpG–SNP pair has guided in delaminating the breast cancer patients into the high- and low-risk groups. In particular, the most significant association was found with respect to cg05370838–rs2230576, cg00956490–rs940453, and cg11340537–rs2640785 CpG–SNP pairs. These CpG–SNP pairs were strongly associated with differential expression of ADAM8, CREB5, and EXPH5 genes, respectively. Besides, the exclusive effect of SNPs such as rs10101376, rs140679, and rs1538146 also hold significant prognostic determinant.
Conclusions: Thus, the analysis based on DNA methylation and SNPs have resulted in the identification of novel susceptible loci that hold prognostic relevance in breast cancer.
- Published
- 2017
10. Candidate RNA structures for domain 3 of the foot-and-mouth-disease virus internal ribosome entry site
- Author
-
Segun Jung and Tamar Schlick
- Subjects
Models, Molecular ,Computational biology ,Biology ,Molecular Dynamics Simulation ,010402 general chemistry ,01 natural sciences ,Tetraloop ,03 medical and health sciences ,Eukaryotic translation ,Untranslated Regions ,Genetics ,Nucleic acid structure ,Binding site ,Peptide Chain Initiation, Translational ,Conserved Sequence ,030304 developmental biology ,0303 health sciences ,Base Sequence ,RNA ,Computational Biology ,Translation (biology) ,Virology ,Protein tertiary structure ,0104 chemical sciences ,Internal ribosome entry site ,Foot-and-Mouth Disease Virus ,Nucleic Acid Conformation ,RNA, Viral
- Abstract
The foot-and-mouth-disease virus (FMDV) utilizes non-canonical translation initiation for viral protein synthesis, by forming a specific RNA structure called internal ribosome entry site (IRES). Domain 3 in FMDV IRES is phylogenetically conserved and highly structured; it contains four-way junctions where intramolecular RNA-RNA interactions serve as a scaffold for the RNA to fold for efficient IRES activity. Although the 3D structure of domain 3 is crucial to exploring and deciphering the initiation mechanism of translation, little is known. Here, we employ a combination of various modeling approaches to propose candidate tertiary structures for the apical region of domain 3, thought to be crucial for IRES function. We begin by modeling junction topology candidates and build atomic 3D models consistent with available experimental data. We then investigate each of the four candidate 3D structures by molecular dynamics simulations to determine the most energetically favorable configurations and to analyze specific tertiary interactions. Only one model emerges as viable containing not only the specific binding site for the GNRA tetraloop but also helical arrangements which enhance the stability of domain 3. These collective findings, together with available experimental data, suggest a plausible theoretical tertiary structure of the apical region in FMDV IRES domain 3.
- Published
- 2012
11. Tertiary Motifs Revealed in Analyses of Higher-Order RNA Junctions
- Author
-
Tamar Schlick, Abdul Iqbal, Segun Jung, and Christian Laing
- Subjects
Models, Molecular ,Base pair ,Stacking ,Protein Data Bank (RCSB PDB) ,RNA ,Biology ,Article ,Crystallography ,chemistry.chemical_compound ,Models, Chemical ,chemistry ,Structural Biology ,Chemical physics ,Nucleic Acid Conformation ,Nucleic acid structure ,Coaxial ,Base Pairing ,Molecular Biology ,Cytosine
- Abstract
RNA junctions are secondary structure elements formed when three or more helices come together. They are present in diverse RNA molecules with various fundamental functions in the cell. To better understand the intricate architecture of three-dimensional RNAs, we analyze currently solved 3D RNA junctions in terms of basepair interactions and three-dimensional configurations. First, we study basepair interaction diagrams for solved RNA junctions with five to ten helices and discuss common features. Second, we compare these higher-order junctions to those containing three or four helices and identify global motif patterns such as coaxial-stacking and parallel and perpendicular helical configurations. These analyses show that higher order junctions organize their helical components in parallel and helical configurations similar to lower order junctions. Their sub-junctions also resemble local helical configurations found in three and four-way junctions, and are stabilized by similar long-range interaction preferences such as A-minor interactions. Furthermore, loop regions within junctions are high in adenine but low in cytosine. And, in agreement with previous studies, we suggest that coaxial stacking between helices likely forms when the common single stranded loop is small in size; however, other factors such as stacking interactions involving non-canonical basepairs and proteins can greatly determine or disrupt coaxial stacking. Finally, we introduce the ribo-base interactions: when combined with the along-groove packing motif, these ribo-base interactions form novel motifs involved in perpendicular helix-helix interactions. Overall, these analyses suggest recurrent tertiary motifs that stabilize junction architecture, pack helices, and help form helical configurations that occur as sub-elements of larger junction networks. 
The frequent occurrence of similar helical motifs suggest Nature’s finite and perhaps limited repertoire of RNA helical conformation preferences. More generally, studies of RNA junctions and tertiary building blocks can ultimately help in the difficult task of RNA 3D structure prediction.
- Published
- 2009
12. Evaluation of data discretization methods to derive platform independent isoform expression signatures for multi-class tumor subtyping
- Author
-
Segun Jung, Yingtao Bi, and Ramana V. Davuluri
- Subjects
Discretization ,exon-array ,Feature selection ,Biology ,Machine Learning ,Multiclass classification ,03 medical and health sciences ,0302 clinical medicine ,RNA Isoforms ,Genetics ,Cluster Analysis ,Humans ,030304 developmental biology ,data discretization ,0303 health sciences ,business.industry ,Gene Expression Profiling ,Research ,Computational Biology ,platform transition ,Pattern recognition ,multi-class classification ,Class (biology) ,Expression (mathematics) ,Random forest ,Statistical classification ,Identification (information) ,ComputingMethodologies_PATTERNRECOGNITION ,030220 oncology & carcinogenesis ,Artificial intelligence ,RNA-seq ,Glioblastoma ,business ,Algorithms ,Biotechnology
- Abstract
Background: Many supervised learning algorithms have been applied in deriving gene signatures for patient stratification from gene expression data. However, transferring the multi-gene signatures from one analytical platform to another without loss of classification accuracy is a major challenge. Here, we compared three unsupervised data discretization methods--Equal-width binning, Equal-frequency binning, and k-means clustering--in accurately classifying the four known subtypes of glioblastoma multiforme (GBM) when the classification algorithms were trained on the isoform-level gene expression profiles from exon-array platform and tested on the corresponding profiles from RNA-seq data.
Results: We applied an integrated machine learning framework that involves three sequential steps: feature selection, data discretization, and classification. For models trained and tested on exon-array data, the addition of data discretization step led to robust and accurate predictive models with fewer number of variables in the final models. For models trained on exon-array data and tested on RNA-seq data, the addition of data discretization step dramatically improved the classification accuracies with Equal-frequency binning showing the highest improvement with more than 90% accuracies for all the models with features chosen by Random Forest based feature selection. Overall, SVM classifier coupled with Equal-frequency binning achieved the best accuracy (> 95%). Without data discretization, however, only 73.6% accuracy was achieved at most.
Conclusions: The classification algorithms, trained and tested on data from the same platform, yielded similar accuracies in predicting the four GBM subgroups. However, when dealing with cross-platform data, from exon-array to RNA-seq, the classifiers yielded stable models with highest classification accuracies on data transformed by Equal frequency binning. The approach presented here is generally applicable to other cancer types for classification and identification of molecular subgroups by integrating data across different gene expression platforms.
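Two of the discretization methods named in the abstract, equal-width and equal-frequency binning, can be sketched in a few lines. This is a generic illustration, not the authors' implementation; the function names, the number of bins, and the toy expression values are assumptions.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins holding roughly equal counts.

    Labels depend only on the rank of each value. Tied values may be split
    across neighbouring bins in this simple version.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels


if __name__ == "__main__":
    expression = [1.0, 2.0, 3.0, 4.0, 100.0, 200.0]  # toy values
    print(equal_width_bins(expression, 3))
    print(equal_frequency_bins(expression, 3))
```

Because equal-frequency labels are rank-based, any strictly increasing rescaling of the values (a log transform, for instance) leaves them unchanged, whereas equal-width labels can shift when the scale changes.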
- Published
- 2015
13. Identification of candidate regulatory SNPs by integrative analysis for prostate cancer genome data
- Author
-
Hongjian Jin, Ramana V. Davuluri, and Segun Jung
- Subjects
Genetics ,Regulation of gene expression ,SNP ,Single-nucleotide polymorphism ,RNA-Seq ,Genome-wide association study ,Biology ,AURKB Gene ,SNP array ,Genetic association
- Abstract
Genome-wide association studies (GWAS) have identified numerous single nucleotide polymorphisms (SNPs), also known as genetic variants, associated with disease susceptibility. Prostate cancer (PCa) is a highly heritable disease. GWAS have so far reported more than 70 SNPs that are associated with PCa risk. However, most of these SNPs are located in noncoding genomic regions, and little is known about their functional roles. Here we describe an informatics system that performs an integrative analysis of ChIP-seq, RNA-seq, SNP array and clinical data for identifying candidate regulatory SNPs (rSNPs) that could alter transcription factor (TF) binding sites and neighboring gene regulation. By applying the informatics framework to the HOXB13 TF in PCa, we identified 213 rSNPs that include a recently discovered rSNP (rs339331) and identified a novel candidate rSNP (rs1476161) associated with PCa risk. We confirmed rs1476161 by performing the HOXB13 knockout experiment. The expression level of the target gene, AURKB, was decreased by about 2-fold in HOXB13-silencing cells compared to the control cells. This indicates the involvement of HOXB13 in altering AURKB gene expression, suggesting a critical role of rs1476161 in allele-specific gene regulation. Taken together, the results demonstrate the feasibility of our system in searching for candidate rSNPs associated with PCa risk.
- Published
- 2015
14. Comparison of data discretization methods for cross platform transfer of gene-expression based tumor subtyping classifier
- Author
-
Ramana V. Davuluri, Segun Jung, and Yingtao Bi
- Subjects
Discretization ,business.industry ,Molecular biophysics ,Genomics ,Feature selection ,Biology ,Machine learning ,computer.software_genre ,Subtyping ,Support vector machine ,ComputingMethodologies_PATTERNRECOGNITION ,Cross-platform ,Artificial intelligence ,Data mining ,business ,Classifier (UML) ,computer
- Abstract
Molecular stratification of tumors is essential for developing personalized therapies. While patient stratification strategies have been successful, computational methods to accurately translate and integrate gene signatures across different high-throughput platforms (e.g., microarray, RNA-seq) are currently lacking. We performed comparative evaluation of different data discretization and feature selection methods combined with state-of-the-art machine learning algorithms to derive platform-independent and accurate multi-gene signatures for classification of the four known subtypes of glioblastoma. Our results show that integrative application of feature selection and data discretization is crucial for successful platform transition and higher prediction accuracy of the derived molecular classifiers.
- Published
- 2014
15. Naïve Bayes for microRNA target predictions--machine learning for microRNA targets
- Author
-
Louise C. Showe, Segun Jung, Michael K. Showe, Andrew V. Kossenkov, and Malik Yousef
- Subjects
Statistics and Probability ,Molecular Sequence Data ,Sequence alignment ,Biology ,Machine learning ,computer.software_genre ,Biochemistry ,Pattern Recognition, Automated ,Naive Bayes classifier ,Bayes' theorem ,Artificial Intelligence ,microRNA ,Base sequence ,Molecular Biology ,Sequence ,Base Sequence ,business.industry ,Sequence Analysis, RNA ,Gene targeting ,Pattern recognition ,Bayes Theorem ,RNA Probes ,Computer Science Applications ,Computational Mathematics ,MicroRNAs ,Computational Theory and Mathematics ,Gene Targeting ,Artificial intelligence ,Target gene ,business ,computer ,Sequence Alignment ,Algorithms
- Abstract
Motivation: Most computational methodologies for miRNA:mRNA target gene prediction use the seed segment of the miRNA and require cross-species sequence conservation in this region of the mRNA target. Methods that do not rely on conservation generate numbers of predictions, which are too large to validate. We describe a target prediction method (NBmiRTar) that does not require sequence conservation, using instead, machine learning by a naïve Bayes classifier. It generates a model from sequence and miRNA:mRNA duplex information from validated targets and artificially generated negative examples. Both the ‘seed’ and ‘out-seed’ segments of the miRNA:mRNA duplex are used for target identification.
Results: The application of machine-learning techniques to the features we have used is a useful and general approach for microRNA target gene prediction. Our technique produces fewer false positive predictions and fewer target candidates to be tested. It exhibits higher sensitivity and specificity than algorithms that rely on conserved genomic regions to decrease false positive predictions.
Availability: The NBmiRTar program is available at http://wotan.wistar.upenn.edu/NBmiRTar/
Contact: yousef@wistar.org
Supplementary information: http://wotan.wistar.upenn.edu/NBmiRTar/
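The general idea of a naïve Bayes classifier trained on positive and negative examples can be sketched as follows. This is a toy Bernoulli variant with Laplace smoothing, not the NBmiRTar program: the two binary features and the class labels are hypothetical, and the actual feature set used by NBmiRTar is not described in this record.

```python
import math


def train_nb(samples, labels):
    """Fit a Bernoulli naive Bayes model on tuples of 0/1 features."""
    n_features = len(samples[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [s for s, y in zip(samples, labels) if y == c]
        prior = len(rows) / len(samples)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        p_one = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[c] = (prior, p_one)
    return model


def predict_nb(model, sample):
    """Return the class with the highest log posterior for one sample."""
    best_class, best_score = None, -math.inf
    for c, (prior, p_one) in model.items():
        score = math.log(prior)
        for x, p in zip(sample, p_one):
            score += math.log(p if x else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


if __name__ == "__main__":
    # Hypothetical features: (seed_is_paired, duplex_is_stable).
    X = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]
    y = ["target", "target", "target",
         "non_target", "non_target", "non_target"]
    model = train_nb(X, y)
    print(predict_nb(model, (1, 1)), predict_nb(model, (0, 0)))
```

Working in log space avoids numerical underflow when many features are multiplied together, which matters more for realistic feature counts than for this two-feature toy.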
- Published
- 2007
16. Predicting Helical Topologies in RNA Junctions as Tree Graphs
- Author
-
Namhee Kim, Shereef Elmetwaly, Mai Zahran, Segun Jung, Tamar Schlick, and Christian Laing
- Subjects
Models, Molecular ,Riboswitch ,RNA Folding ,Molecular Sequence Data ,Biophysics ,Structure Prediction ,lcsh:Medicine ,Biology ,Topology ,Bioinformatics ,Network topology ,Biochemistry ,Cross-validation ,03 medical and health sciences ,Molecular cell biology ,Software Design ,Nucleic acid structure ,RNA structure ,lcsh:Science ,Computerized Simulations ,Protein secondary structure ,030304 developmental biology ,0303 health sciences ,Multidisciplinary ,Base Sequence ,lcsh:R ,030302 biochemistry & molecular biology ,Computational Biology ,Reproducibility of Results ,Software Engineering ,RNA ,Genomics ,Protein structure prediction ,Nucleic acids ,Macromolecular structure analysis ,Computer Science ,Transfer RNA ,Nucleic Acid Conformation ,lcsh:Q ,Research Article
- Abstract
RNA molecules are important cellular components involved in many fundamental biological processes. Understanding the mechanisms behind their functions requires knowledge of their tertiary structures. Though computational RNA folding approaches exist, they often require manual manipulation and expert intuition; predicting global long-range tertiary contacts remains challenging. Here we develop a computational approach and associated program module (RNAJAG) to predict helical arrangements/topologies in RNA junctions. Our method has two components: junction topology prediction and graph modeling. First, junction topologies are determined by a data mining approach from a given secondary structure of the target RNAs; second, the predicted topology is used to construct a tree graph consistent with geometric preferences analyzed from solved RNAs. The predicted graphs, which model the helical arrangements of RNA junctions for a large set of 200 junctions using a cross validation procedure, yield fairly good representations compared to the helical configurations in native RNAs, and can be further used to develop all-atom models as we show for two examples. Because junctions are among the most complex structural elements in RNA, this work advances folding structure prediction methods of large RNAs. The RNAJAG module is available to academic users upon request.
- Published
- 2013
Discovery Service for Jio Institute Digital Library