22 results on "Ganapathiraju M"
Search Results
2. Abstract # 1827 Cilia, an organelle central to psychoneuroimmunology – A characterization through its interactome
- Author
Ganapathiraju, M., primary, Chaparala, S., additional, and Lo, C., additional
- Published
- 2016
- Full Text
- View/download PDF
3. Analysis of LMNB1 Duplications in Autosomal Dominant Leukodystrophy Provides Insights into Duplication Mechanisms and Allele-Specific Expression
- Author
Giorgio E, Rolyan H, Kropp L, Chakka AB, Yatsenko S, Di Gregorio E, Lacerenza D, Vaula G, Talarico F, Mandich P, Toro C, Pierre EE, Labauge P, Capellari S, Cortelli P, Vairo FP, Miguel D, Stubbolo D, Marques LC, Gahl W, Boespflug-Tanguy O, Melberg A, Hassin-Baer S, Cohen OS, Pjontek R, Grau A, Klopstock T, Fogel B, Meijer I, Rouleau G, Bouchard JP, Ganapathiraju M, Vanderver A, Dahl N, Hobson G, Brusco A, Brussino A, and Padiath QS
- Subjects
metabolism [Pelizaeus-Merzbacher Disease] ,Adult ,leukodystrophy ,Pelizaeus-Merzbacher Disease ,metabolism [Lamin Type B] ,Molecular Sequence Data ,NHE ,duplication Alu ,metabolism [RNA, Messenger] ,genetics [RNA, Messenger] ,Chromosome Breakpoints ,ADLD ,autosomal dominant leukodystrophy ,Gene Duplication ,Lamin B1 ,LMNB1 ,NHEJ ,MMBIR ,FoSTeS ,Humans ,ddc:610 ,RNA, Messenger ,lamin B1 ,Research Articles ,chemistry [DNA] ,Comparative Genomic Hybridization ,genetics [DNA] ,Base Sequence ,Lamin Type B ,DNA ,genetics [Lamin Type B] ,Nucleic Acid Conformation ,genetics [Pelizaeus-Merzbacher Disease]
- Abstract
Autosomal dominant leukodystrophy (ADLD) is an adult onset demyelinating disorder that is caused by duplications of the lamin B1 (LMNB1) gene. However, as only a few cases have been analyzed in detail, the mechanisms underlying LMNB1 duplications are unclear. We report the detailed molecular analysis of the largest collection of ADLD families studied to date. We have identified the minimal duplicated region necessary for the disease, defined all the duplication junctions at the nucleotide level, and identified the first inverted LMNB1 duplication. We have demonstrated that the duplications are not recurrent; patients with identical duplications share the same haplotype, likely inherited from a common founder, and that the duplications originated from intrachromosomal events. The duplication junction sequences indicated that nonhomologous end joining or replication-based mechanisms such as fork stalling and template switching or microhomology-mediated break induced repair are likely to be involved. LMNB1 expression was increased in patients’ fibroblasts both at mRNA and protein levels and the three LMNB1 alleles in ADLD patients show equal expression, suggesting that regulatory regions are maintained within the rearranged segment. These results have allowed us to elucidate duplication mechanisms and provide insights into allele-specific LMNB1 expression levels.
- Published
- 2013
- Full Text
- View/download PDF
4. 166. Newly discovered protein–protein interactions of schizophrenia associated genes and their biomedical significance
- Author
Ganapathiraju, M., primary, Sweet, R.A., additional, and Mohamed, T.P., additional
- Published
- 2011
- Full Text
- View/download PDF
5. 204. Cytokine interactome to accelerate discovery in deciphering molecular interactions among psycho neuro immuno inflammo processes
- Author
Ganapathiraju, M., primary and Mohamed, T.P., additional
- Published
- 2011
- Full Text
- View/download PDF
6. Comparative n-gram analysis of whole-genome protein sequences
- Author
Ganapathiraju, M., primary, Weisser, D., additional, Rosenfeld, R., additional, Carbonell, J., additional, Reddy, R., additional, and Klein-Seetharaman, J., additional
- Published
- 2002
- Full Text
- View/download PDF
7. An efficient heuristic method for active feature acquisition and its application to protein-protein interaction prediction
- Author
Thahir Mohamed, Sharma Tarun, and Ganapathiraju Madhavi K
- Subjects
Medicine ,Science
- Abstract
Background: Machine learning approaches for classification learn the pattern of the feature space of different classes, or learn a boundary that separates the feature space into different classes. The features of the data instances are usually available, and it is only the class labels of the instances that are unavailable. For example, to classify text documents into different topic categories, the words in the documents are features and they are readily available, whereas the topic is what is predicted. However, in some domains obtaining features may be resource-intensive, so not all features may be available. An example is protein-protein interaction (PPI) prediction, where not only are the labels ('interacting' or 'non-interacting') unavailable, but so are some of the features. It may be possible to obtain at least some of the missing features by carrying out a few experiments as permitted by the available resources. If only a few experiments can be carried out to acquire missing features, which proteins should be studied and which features of those proteins should be determined? From the perspective of machine learning for PPI prediction, it would be desirable to acquire those features that, when used in training the classifier, improve its accuracy the most. That is, the utility of the feature acquisition is measured in terms of how much the acquired features contribute to improving the accuracy of the classifier. Active feature acquisition (AFA) is a strategy to preselect such instance-feature combinations (i.e. protein and experiment combinations) for maximum utility. The goal of AFA is the creation of an optimal training set that would result in the best classifier, not the determination of the best classification model itself.
Results: We present a heuristic method for active feature acquisition to calculate the utility of acquiring a missing feature. This heuristic takes into account the change in belief of the classification model induced by the acquisition of the feature under consideration. Compared to random selection of the proteins on which the experiments are performed and the type of experiment that is performed, the heuristic method reduces the number of experiments to as few as 40%. The most notable characteristic of this method is that it does not require re-training of the classification model on every possible combination of instance, feature and feature-value tuples. For this reason, our method is far less computationally expensive than previous AFA strategies.
Conclusions: The results show that our heuristic method for AFA creates an optimal training set with far fewer features acquired than random acquisition. This shows the value of active feature acquisition in aiding protein-protein interaction prediction where feature acquisition is costly. Compared to previous methods, the proposed method reduces computational cost while also achieving a better F-score. The proposed method is valuable as it presents a direction for AFA with far lower computational expense by removing, for the first time, the need to train a classifier for every combination of instance, feature and feature-value tuples, which would be impractical for several domains.
- Published
- 2012
- Full Text
- View/download PDF
8. Active learning for human protein-protein interaction prediction
- Author
Carbonell Jaime G, Mohamed Thahir P, and Ganapathiraju Madhavi K
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5
- Abstract
Background: Biological processes in cells are carried out by means of protein-protein interactions. Determining whether a pair of proteins interacts by wet-lab experiments is resource-intensive; only about 38,000 interactions, out of a few hundred thousand expected interactions, are known today. Active machine learning can guide the selection of pairs of proteins for future experimental characterization in order to accelerate accurate prediction of the human protein interactome.
Results: Random forest (RF) has previously been shown to be effective for predicting protein-protein interactions. Here, four different active learning algorithms have been devised for selection of protein pairs to be used to train the RF. With labels of as few as 500 protein pairs selected using any of the four active learning methods described here, the classifier achieved a higher F-score (harmonic mean of precision and recall) than with 3000 randomly chosen protein pairs. The F-score of predicted interactions is shown to increase by about 15% with active learning in comparison to that with random selection of data.
Conclusion: Active learning algorithms enable learning more accurate classifiers with much less labelled data and prove to be useful in applications where manual annotation of data is formidable. The active learning techniques demonstrated here can also be applied to other proteomics applications such as protein structure prediction and classification.
- Published
- 2010
- Full Text
- View/download PDF
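The F-score quoted in the abstract of result 8 is defined there as the harmonic mean of precision and recall. A minimal sketch of that calculation follows; the function name `f_score` and the convention of returning 0.0 when precision or recall is undefined are assumptions made for illustration, not details taken from the paper.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall (assumed to be 0.0 when undefined)."""
    predicted = true_positives + false_positives   # pairs predicted as interacting
    actual = true_positives + false_negatives      # pairs that truly interact
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 correct predictions, 2 false alarms, 8 missed interactions.
    print(round(f_score(8, 2, 8), 4))
```

Because the harmonic mean is dominated by the smaller of the two values, a classifier with high precision but low recall still receives a modest F-score.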
9. N-gram analysis of 970 microbial organisms reveals presence of biological language models
- Author
Ganapathiraju Madhavi K and Osmanbeyoglu Hatice
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5
- Abstract
Background: It has been suggested previously that genome and proteome sequences show characteristics typical of natural-language texts such as "signature-style" word usage indicative of authors or topics, and that the algorithms originally developed for natural language processing may therefore be applied to genome sequences to draw biologically relevant conclusions. Following this approach of 'biological language modeling', statistical n-gram analysis has been applied for comparative analysis of whole proteome sequences of 44 organisms. It has been shown that a few particular amino acid n-grams are found in abundance in one organism but occur very rarely in other organisms, thereby serving as genome signatures. At that time proteomes of only 44 organisms were available, thereby limiting the generalization of this hypothesis. Today nearly 1,000 genome sequences and corresponding translated sequences are available, making it feasible to test the existence of biological language models over the evolutionary tree.
Results: We studied whole proteome sequences of 970 microbial organisms using n-gram frequencies and cross-perplexity employing the Biological Language Modeling Toolkit and Patternix Revelio toolkit. Genus-specific signatures were observed even in a simple unigram distribution. By taking the statistical n-gram model of one organism as reference and computing the cross-perplexity of all other microbial proteomes with it, cross-perplexity was found to be predictive of branch distance of the phylogenetic tree. For example, a 4-gram model from the proteome of Shigella flexneri 2a, which belongs to the Gammaproteobacteria class, showed a self-perplexity of 15.34 while the cross-perplexity of other organisms was in the range of 15.59 to 29.5 and was proportional to their branching distance in the evolutionary tree from S. flexneri. The organisms of this genus, which happen to be pathotypes of E. coli, also have the closest perplexity values with E. coli.
Conclusion: Whole proteome sequences of microbial organisms have been shown to contain particular n-gram sequences in abundance in one organism but occurring very rarely in other organisms, thereby serving as proteome signatures. Further, it has also been shown that perplexity, a statistical measure of similarity of n-gram composition, can be used to predict evolutionary distance within a genus in the phylogenetic tree.
- Published
- 2011
- Full Text
- View/download PDF
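The cross-perplexity idea in the abstract of result 9 (train an n-gram model on one proteome, then score another proteome with it) can be illustrated with a small sketch. This is not the Biological Language Modeling Toolkit's implementation: the add-one smoothing, the 20-letter alphabet constant and the function names `train_ngram_counts` and `perplexity` are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (assumed alphabet)


def train_ngram_counts(sequences, n):
    """Count n-grams and their (n-1)-length contexts in the reference sequences."""
    ngram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            ngram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return ngram_counts, context_counts


def perplexity(sequences, model, n):
    """Perplexity of `sequences` under `model`, using add-one smoothing (an assumption)."""
    ngram_counts, context_counts = model
    log_prob_sum = 0.0
    total = 0
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            # Smoothed conditional probability of the last residue given its context.
            p = (ngram_counts[gram] + 1) / (context_counts[gram[:-1]] + len(ALPHABET))
            log_prob_sum += math.log(p)
            total += 1
    if total == 0:
        raise ValueError("no n-grams to score")
    return math.exp(-log_prob_sum / total)


if __name__ == "__main__":
    # Toy sequences, purely hypothetical; real proteomes contain millions of residues.
    reference = ["MKVLAAGIVMKVLAAGIV"]
    other = ["WWWWCCCCHHHHWWWW"]
    model = train_ngram_counts(reference, n=2)
    print("self-perplexity :", perplexity(reference, model, n=2))
    print("cross-perplexity:", perplexity(other, model, n=2))
```

Scoring the reference against its own model gives the self-perplexity; a sequence whose n-gram composition differs from the reference scores higher, which is the quantity the abstract relates to branch distance.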
10. Mitotic block and epigenetic repression underlie neurodevelopmental defects and neurobehavioral deficits in congenital heart disease.
- Author
Gabriel GC, Yagi H, Tan T, Bais A, Glennon BJ, Stapleton MC, Huang L, Reynolds WT, Shaffer MG, Ganapathiraju M, Simon D, Panigrahy A, Wu YL, and Lo CW
- Subjects
- Animals, Mice, Neurogenesis genetics, Neurodevelopmental Disorders genetics, Neurodevelopmental Disorders etiology, Heart Defects, Congenital genetics, Microcephaly genetics, Male, Humans, Mitosis genetics, DNA Methylation, Female, Mutation, Autistic Disorder genetics, Apoptosis genetics, Mice, Inbred C57BL, Epigenesis, Genetic, Disease Models, Animal
- Abstract
Hypoplastic left heart syndrome (HLHS) is a severe congenital heart disease associated with microcephaly and poor neurodevelopmental outcomes. Here we show that the Ohia HLHS mouse model, with mutations in Sap130, a chromatin modifier, and Pcdha9, a cell adhesion protein, also exhibits microcephaly associated with mitotic block and increased apoptosis leading to impaired cortical neurogenesis. Transcriptome profiling, DNA methylation, and Sap130 ChIPseq analyses all demonstrate dysregulation of genes associated with autism and cognitive impairment. This includes perturbation of REST transcriptional regulation of neurogenesis, disruption of CREB signaling regulating synaptic plasticity, and defects in neurovascular coupling mediating cerebral blood flow. Adult mice harboring either the Pcdha9 mutation, which show normal brain anatomy, or forebrain-specific Sap130 deletion via Emx1-Cre, which show microcephaly, both demonstrate learning and memory deficits and autism-like behavior. These findings provide mechanistic insights indicating the adverse neurodevelopment in HLHS may involve cell autonomous/nonautonomous defects and epigenetic dysregulation.
Competing interests: The authors declare no competing interests. © 2025. The Author(s).
- Published
- 2025
- Full Text
- View/download PDF
11. The Role of Cilia and the Complex Genetics of Congenital Heart Disease.
- Author
Gabriel GC, Ganapathiraju M, and Lo CW
- Subjects
- Humans, Animals, Mice, Mutation, Signal Transduction, Protein Interaction Maps, Cilia pathology, Cilia genetics, Cilia metabolism, Heart Defects, Congenital genetics, Heart Defects, Congenital pathology
- Abstract
Congenital heart disease (CHD) can affect up to 1% of live births, and despite abundant evidence of a genetic etiology, the genetic landscape of CHD is still not well understood. A large-scale mouse chemical mutagenesis screen for mutations causing CHD yielded a preponderance of cilia-related genes, pointing to a central role for cilia in CHD pathogenesis. The genes uncovered by the screen included genes that regulate ciliogenesis and cilia-transduced cell signaling as well as many that mediate endocytic trafficking, a cell process critical for both ciliogenesis and cell signaling. The clinical relevance of these findings is supported by whole-exome sequencing analysis of CHD patients that showed enrichment for pathogenic variants in ciliome genes. Surprisingly, among the ciliome CHD genes recovered were many that encoded direct protein-protein interactors. Assembly of the CHD genes into a protein-protein interaction network yielded a tight interactome that suggested this protein-protein interaction may have functional importance and that its disruption could contribute to the pathogenesis of CHD. In light of these and other findings, we propose that an interactome enriched for ciliome genes may provide the genomic context for the complex genetics of CHD and its often-observed incomplete penetrance and variable expressivity.
- Published
- 2024
- Full Text
- View/download PDF
12. Mitotic Block and Epigenetic Repression Underlie Neurodevelopmental Defects and Neurobehavioral Deficits in Congenital Heart Disease.
- Author
Gabriel GC, Yagi H, Tan T, Bais AS, Glennon BJ, Stapleton MC, Huang L, Reynolds WT, Shaffer MG, Ganapathiraju M, Simon D, Panigrahy A, Wu YL, and Lo CW
- Abstract
Poor neurodevelopment is often observed with congenital heart disease (CHD), especially with mutations in chromatin modifiers. Here, analysis of mice with hypoplastic left heart syndrome (HLHS) arising from mutations in the Sin3A-associated chromatin modifier Sap130 and the adhesion protein Pcdha9 revealed neurodevelopmental and neurobehavioral deficits reminiscent of those in HLHS patients. Microcephaly was associated with impaired cortical neurogenesis, mitotic block, and increased apoptosis. Transcriptional profiling indicated dysregulated neurogenesis by REST, altered CREB signaling regulating memory and synaptic plasticity, and impaired neurovascular coupling modulating cerebral blood flow. Many neurodevelopmental/neurobehavioral disease pathways were recovered, including autism and cognitive impairment. These same pathways emerged from genome-wide DNA methylation and Sap130 chromatin immunoprecipitation sequencing analyses, suggesting epigenetic perturbation. Mice with Pcdha9 mutation or forebrain-specific Sap130 deletion without CHD showed learning/memory deficits and autism-like behavior. These novel findings provide mechanistic insights indicating the adverse neurodevelopment in HLHS may involve cell autonomous/nonautonomous defects and epigenetic dysregulation and suggest new avenues for therapy.
- Published
- 2024
- Full Text
- View/download PDF
13. BEE FIRST: A standardized point-of-care ultrasound approach to a patient with dyspnea.
- Author
Ganapathiraju M, Paulson CL, Greenberg MR, and Roth KR
- Abstract
Dyspnea is a common complaint in patients who present to the emergency department and can be due to numerous etiologies. This case report details a 90-year-old female with a history significant for hypertension, hyperlipidemia, and new diagnosis of ovarian malignancy whose symptoms increased over the past three days. Point-of-care ultrasonography showed multiple B-lines, a plethoric IVC without respiratory variation, a markedly low EF and a lack of RV dilation. There was also no evidence of effusion, which led the emergency medicine team to the diagnosis of acute decompensated heart failure. This quick diagnosis was possible due to using the standardized POCUS approach guided by the BEE FIRST algorithm. BEE FIRST can help physicians remember: B-lines are indicative of interstitial thickening; Effusion, such as pericardial or pleural, should be checked for; Ejection Fraction is useful in assessing for heart failure; IVC/Infection/Infarct correlates with central venous pressure, and can be used to assess volume status, check for enlargement, evidence of pneumonia, subpleural consolidation "shred sign", hepatization of lung, and/or pulmonary infarction related to pulmonary embolism; Right Heart Strain can indicate pulmonary embolism or pulmonary hypertension; Sliding Lung can assess for pneumothorax and pleural characteristics; and lastly, Thrombosis/Tumor can assess for myxoma, and interrogation of lower extremities for deep vein thrombosis can aid in dyspnea differentiation. In this report, we demonstrate how the framework BEE FIRST offers a standardized stepwise approach to the utilization of POCUS in a patient with acute dyspnea in the ED setting.
© 2022 The Authors. Published by Elsevier Inc. on behalf of University of Washington.
- Published
- 2022
- Full Text
- View/download PDF
14. Potentially repurposable drugs for COVID-19 identified from SARS-CoV-2 Host Protein Interactome.
- Author
Karunakaran KB, Balakrishnan N, and Ganapathiraju M
- Abstract
We previously presented the protein-protein interaction network - the 'HoP' or the host protein interactome - of 332 host proteins that were identified to interact with 27 nCoV19 viral proteins by Gordon et al. Here, we studied drugs targeting the proteins in this interactome to identify whether any of them may potentially be repurposable against SARS-CoV-2. We studied each of the drugs using the BaseSpace Correlation Engine and identified those that induce gene expression profiles negatively correlated with SARS-associated expression profile. This analysis resulted in 20 drugs whose differential gene expression (drug versus normal) had an anti-correlation with differential expression for SARS (viral infection versus normal). These included drugs that were already being tested for their clinical activity against SARS-CoV-2, those with proven activity against SARS-CoV/MERS-CoV, broad-spectrum antiviral drugs, and those identified/prioritized by other computational re-purposing studies. In summary, our integrated computational analysis of the HoP interactome in conjunction with drug-induced transcriptomic data resulted in drugs that may be repurposable for COVID-19.
Competing interests: The authors declare no competing interests.
- Published
- 2020
- Full Text
- View/download PDF
15. Schizophrenia interactome: fully-labeled interactome network.
- Author
Ganapathiraju M and Chaparala S
- Published
- 2016
- Full Text
- View/download PDF
16. Mycobacterium tuberculosis and Clostridium difficille interactomes: demonstration of rapid development of computational system for bacterial interactome prediction.
- Author
Ananthasubramanian S, Metri R, Khetan A, Gupta A, Handen A, Chandra N, and Ganapathiraju M
- Abstract
Background: Protein-protein interaction (PPI) networks (interactomes) of most organisms, except for some model organisms, are largely unknown. Experimental methods including high-throughput techniques are highly resource intensive. Therefore, computational discovery of PPIs can accelerate biological discovery by presenting "most-promising" pairs of proteins that are likely to interact. For many bacteria, genome sequence, and thereby genomic context of proteomes, is readily available; additionally, for some of these proteomes, localization and functional annotations are also available, but interactomes are not available. We present here a method for rapid development of a computational system to predict the interactome of bacterial proteomes. While other studies have presented methods to transfer interologs across species, here we propose transfer of computational models to benefit from cross-species annotations, thereby predicting many more novel interactions even in the absence of interologs. Mycobacterium tuberculosis (Mtb) and Clostridium difficile (CD) have been used to demonstrate the work.
Results: We developed a random forest classifier over features derived from Gene Ontology annotations and genetic context scores provided by the STRING database for predicting Mtb and CD interactions independently. The Mtb classifier gave a precision of 94% and a recall of 23% on a held-out test set. The Mtb model was then run on all the 8 million protein pairs of the Mtb proteome, resulting in 708 new interactions (at 94% expected precision) or 1,595 new interactions (at 80% expected precision). The CD classifier gave a precision of 90% and a recall of 16% on a held-out test set. The CD model was run on all the 8 million protein pairs of the CD proteome, resulting in 143 new interactions (at 90% expected precision) or 580 new interactions (at 80% expected precision). We also compared the overlap of predictions of our method with STRING database interactions for CD and Mtb and also with interactions identified recently by a bacterial 2-hybrid system for Mtb. To demonstrate the utility of transfer of computational models, we made use of the developed Mtb model and used it to predict CD protein pairs. The cross-species model thus developed yielded a precision of 88% at a recall of 8%. To demonstrate transfer of features from other organisms in the absence of feature-based and interaction-based information, we transferred missing feature values from Mtb orthologs into the CD data. In transferring this data from orthologs (not interologs), we showed that a large number of interactions can be predicted.
Conclusions: Rapid discovery of a (partial) bacterial interactome can be made by using the existing set of GO and STRING features associated with the organisms. We can make use of cross-species interactome development when there are not even sufficient known interactions to develop a computational prediction system. A computational model of well-studied organism(s) can be employed to make the initial interactome prediction for the target organism. We have also demonstrated successfully that annotations can be transferred from orthologs in well-studied organisms, enabling accurate predictions for organisms with no annotations. These approaches can serve as building blocks to address the challenges associated with feature coverage and missing interactions, towards rapid interactome discovery for bacterial organisms.
Availability: The predictions for all Mtb and CD proteins are made available at: http://severus.dbmi.pitt.edu/TB and http://severus.dbmi.pitt.edu/CD respectively for browsing as well as for download.
- Published
- 2012
- Full Text
- View/download PDF
17. Transmembrane helix prediction using amino acid property features and latent semantic analysis.
- Author
Ganapathiraju M, Balakrishnan N, Reddy R, and Klein-Seetharaman J
- Subjects
- Amino Acid Sequence, Computer Simulation, Molecular Sequence Data, Protein Conformation, Semantics, Algorithms, Artificial Intelligence, Membrane Proteins chemistry, Models, Chemical, Models, Molecular, Sequence Analysis, Protein methods
- Abstract
Background: Prediction of transmembrane (TM) helices by statistical methods suffers from lack of sufficient training data. Current best methods use hundreds or even thousands of free parameters in their models which are tuned to fit the little data available for training. Further, they are often restricted to the generally accepted topology "cytoplasmic-transmembrane-extracellular" and cannot adapt to membrane proteins that do not conform to this topology. Recent crystal structures of channel proteins have revealed novel architectures showing that the above topology may not be as universal as previously believed. Thus, there is a need for methods that can better predict TM helices even in novel topologies and families.
Results: Here, we describe a new method "TMpro" to predict TM helices with high accuracy. To avoid overfitting to existing topologies, we have collapsed cytoplasmic and extracellular labels to a single state, non-TM. TMpro is a binary classifier which predicts TM or non-TM using multiple amino acid properties (charge, polarity, aromaticity, size and electronic properties) as features. The features are extracted from sequence information by applying the framework used for latent semantic analysis of text documents and are input to neural networks that learn the distinction between TM and non-TM segments. The model uses only 25 free parameters. In benchmark analysis TMpro achieves 95% segment F-score corresponding to 50% reduction in error rate compared to the best methods not requiring an evolutionary profile of a protein to be known. Performance is also improved when applied to more recent and larger high resolution datasets PDBTM and MPtopo. TMpro predictions in membrane proteins with unusual or disputed TM structure (K+ channel, aquaporin and HIV envelope glycoprotein) are discussed.
Conclusion: TMpro uses very few free parameters in modeling TM segments as opposed to the very large number of free parameters used in state-of-the-art membrane prediction methods, yet achieves very high segment accuracies. This is highly advantageous considering that high resolution transmembrane information is available only for very few proteins. The greatest impact of TMpro is therefore expected in the prediction of TM segments in proteins with novel topologies. Further, the paper introduces a novel method of extracting features from protein sequence, namely that of latent semantic analysis model. The success of this approach in the current context suggests that it can find potential applications in other sequence-based analysis problems.
Availability: http://linzer.blm.cs.cmu.edu/tmpro/ and http://flan.blm.cs.cmu.edu/tmpro/
- Published
- 2008
- Full Text
- View/download PDF
18. TMpro web server and web service: transmembrane helix prediction through amino acid property analysis.
- Author
Ganapathiraju M, Jursa CJ, Karimi HA, and Klein-Seetharaman J
- Subjects
- Computer Simulation, Protein Conformation, Internet, Membrane Proteins chemistry, Membrane Proteins ultrastructure, Models, Chemical, Models, Molecular, Sequence Analysis, Protein methods, Software, User-Computer Interface
- Abstract
TMpro is a transmembrane (TM) helix prediction algorithm that uses language processing methodology for TM segment identification. It is primarily based on the analysis of statistical distributions of properties of amino acids in transmembrane segments. This article describes the availability of TMpro on the internet via a web interface. The key features of the interface are: (i) output is generated in multiple formats including a user-interactive graphical chart which allows comparison of TMpro predicted segment locations with other labeled segments input by the user, such as predictions from other methods; (ii) up to 5000 sequences can be submitted at a time for prediction; (iii) TMpro is available as a web server and is published as a web service so that the method can be accessed by users as well as other services depending on the need for data integration.
Availability: http://linzer.blm.cs.cmu.edu/tmpro/ (web server and help), http://blm.sis.pitt.edu:8080/axis/services/TMProFetcherService (web service).
- Published
- 2007
- Full Text
- View/download PDF
19. Evolutionary insights from suffix array-based genome sequence analysis.
- Author
Poddar A, Chandra N, Ganapathiraju M, Sekar K, Klein-Seetharaman J, Reddy R, and Balakrishnan N
- Subjects
- Algorithms, Animals, Computational Biology, Mycobacterium tuberculosis genetics, Oligopeptides genetics, Software, Evolution, Molecular, Genome, Oligonucleotide Array Sequence Analysis, Protein Array Analysis, Sequence Analysis, DNA, Sequence Analysis, Protein
- Abstract
Gene and protein sequence analyses, central components of studies in modern biology, are easily amenable to string matching and pattern recognition algorithms. The growing need to analyse whole genome sequences more efficiently and thoroughly has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures well known in many other areas and are highly suited for sequence analysis too. Here we report an improvement to the design of suffix array construction. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for the coverage of the possible peptide space, indicating that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. Besides, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a 'meaning' for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv, have been found to contain a repeating sequence of 300 amino acids.
- Published
- 2007
- Full Text
- View/download PDF
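The repeat identification mentioned in the abstract above can be illustrated with a minimal sketch. This is not the improved construction reported in the paper, whose details are not given here: it assumes a naive suffix array built by sorting suffixes directly, and the function names (`build_suffix_array`, `longest_repeat`) are hypothetical. It only shows why adjacent entries of a suffix array expose repeated substrings.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction (sorts whole suffixes); fine for short toy inputs,
    not for whole genomes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_repeat(text: str) -> str:
    """Return the longest substring that occurs at least twice in `text`.

    Any repeated substring is a shared prefix of two suffixes, and the
    longest shared prefix is always found between neighbours in sorted order.
    """
    sa = build_suffix_array(text)
    best_start, best_len = 0, 0
    for prev, cur in zip(sa, sa[1:]):
        k = common_prefix_len(text[prev:], text[cur:])
        if k > best_len:
            best_start, best_len = cur, k
    return text[best_start:best_start + best_len]


if __name__ == "__main__":
    # Toy peptide string (made up for illustration, not real data).
    print(longest_repeat("MKTAYMKTAG"))
```

Sorting whole suffixes costs far more than a dedicated construction algorithm, so a sketch like this is only suitable for short sequences.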
20. Comparison of stability predictions and simulated unfolding of rhodopsin structures.
- Author
-
Tastan O, Yu E, Ganapathiraju M, Aref A, Rader AJ, and Klein-Seetharaman J
- Subjects
- Algorithms, Amino Acid Sequence, Animals, Bacteriorhodopsins chemistry, Bacteriorhodopsins genetics, Drug Stability, Halorhodopsins chemistry, Halorhodopsins genetics, Humans, In Vitro Techniques, Models, Molecular, Molecular Sequence Data, Photochemistry, Point Mutation, Protein Denaturation, Protein Folding, Rhodopsin genetics, Rhodopsin chemistry
- Abstract
Developing a better mechanistic understanding of membrane protein folding is urgently needed because of the discovery of an increasing number of human diseases in which membrane protein instability and misfolding are involved. Towards this goal, we investigated folding and stability of 7-transmembrane (TM) helical bundles by computational methods. We compared the results of three different algorithms for predicting changes in stability of proteins against an experimental mutation dataset obtained for bacteriorhodopsin (BR) and mammalian rhodopsin and find that 61.6% and 70.6% of the mutation results can potentially be explained by known local contributors to the stability of the folded state of BR and mammalian rhodopsin, respectively. To obtain further information on the predicted folding pathway of 7-TM proteins, we conducted simulated thermal unfolding experiments of all available rhodopsin structures with resolution better than 3 angstroms using the Floppy Inclusions and Rigid Substructure Topography (FIRST) method (Jacobs, D. J., A. J. Rader, L. A. Kuhn and M. F. Thorpe [2001] Proteins 44, 150) described previously for a single mammalian rhodopsin structure (Rader et al. [2004] PNAS 101, 7246). In statistical comparison we found that structures of mammalian rhodopsin have a stability core that is characterized by long-range interactions involving amino acids close in space but distant in sequence comprising positions from both extracellular loop and TM regions. In contrast, BR-simulated unfolding does not reveal such a core but is dominated by interactions within individual and groups of TM helices, consistent with the two-stage hypothesis of membrane protein folding. Similar results were obtained for halo- and sensory rhodopsins as for BRs. However, the average folding core energies of sensory rhodopsins were in between those observed for mammalian rhodopsins and BRs, hinting at a possible evolution of these structures toward a rhodopsin-like behavior. These results support the conclusion that although the two-stage model can explain the mechanisms of folding and stability of BR, it fails to account for the folding and stability of mammalian rhodopsin, even though the two proteins are structurally related.
- Published
- 2007
- Full Text
- View/download PDF
21. Retinitis pigmentosa associated with rhodopsin mutations: Correlation between phenotypic variability and molecular effects.
- Author
-
Iannaccone A, Man D, Waseem N, Jennings BJ, Ganapathiraju M, Gallaher K, Reese E, Bhattacharya SS, and Klein-Seetharaman J
- Subjects
- Adolescent, Adult, Age Factors, Amino Acid Substitution, Child, Child, Preschool, Computational Biology, DNA Mutational Analysis, Disease Progression, Electroretinography, Female, Humans, Male, Pedigree, Peptide Fragments genetics, Phenotype, Retinitis Pigmentosa metabolism, Rhodopsin metabolism, Rod Cell Outer Segment metabolism, Vision, Ocular, Mutation, Retinitis Pigmentosa genetics, Rhodopsin genetics
- Abstract
Similar retinitis pigmentosa (RP) phenotypes can result from mutations affecting different rhodopsin regions, and distinct amino acid substitutions can cause different RP severity and progression rates. Specifically, both the R135L and R135W mutations (cytoplasmic end of H3) result in diffuse, severe disease (class A), but R135W causes more severe and more rapidly progressive RP than R135L. The P180A and G188R mutations (second intradiscal loop) exhibit a mild phenotype with regional variability (class B1) and diffuse disease of moderate severity (class B2), respectively. Computational and in vitro studies of these mutants provide molecular insights into this phenotypic variability.
- Published
- 2006
- Full Text
- View/download PDF
22. BLMT: statistical sequence analysis using N-grams.
- Author
-
Ganapathiraju M, Manoharan V, and Klein-Seetharaman J
- Subjects
- Computer Graphics, Algorithms, Data Interpretation, Statistical, Sequence Alignment methods, Sequence Analysis methods, Software, User-Computer Interface
- Abstract
Statistical analysis of amino acid and nucleotide sequences, especially sequence alignment, is one of the most commonly performed tasks in modern molecular biology. However, for many tasks in bioinformatics, the requirement for the features in an alignment to be consecutive is restrictive and "n-grams" (aka k-tuples) have been used as features instead. N-grams are usually short nucleotide or amino acid sequences of length n, but the unit for a gram may be chosen arbitrarily. The n-gram concept is borrowed from language technologies where n-grams of words form the fundamental units in statistical language models. Despite the demonstrated utility of n-gram statistics for the biology domain, there is currently no publicly accessible generic tool for the efficient calculation of such statistics. Most sequence analysis tools will disregard matches because of the lack of statistical significance in finding short sequences. This article presents the integrated Biological Language Modeling Toolkit (BLMT) that allows efficient calculation of n-gram statistics for arbitrary sequence datasets. Availability: BLMT can be downloaded from http://www.cs.cmu.edu/~blmt/source and installed for standalone use on any Unix platform or Unix shell emulation such as Cygwin on the Windows platform. Specific tools and usage details are described in a "readme" file. The n-gram computations carried out by the BLMT are part of a broader set of tools borrowed from language technologies and modified for statistical analysis of biological sequences; these are available at http://flan.blm.cs.cmu.edu/.
- Published
- 2004
- Full Text
- View/download PDF
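The n-gram idea described in the abstract above, including the point that the unit of a gram may be chosen arbitrarily, can be sketched in a few lines. This is an illustrative sketch only and does not reproduce BLMT or its interface; the names `count_ngrams` and `split_codons` are hypothetical, and the sequences are made-up toy inputs.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence


def count_ngrams(sequences: Iterable[Sequence[Hashable]], n: int) -> Counter:
    """Count overlapping n-grams across sequences.

    Each sequence is any indexable run of units: a string (units are single
    residues or bases) or a list (units are whatever the caller chose).
    N-grams are stored as tuples of units and never span two sequences.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    counts: Counter = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def split_codons(dna: str) -> list[str]:
    """Split a DNA string into non-overlapping codons, dropping any partial tail."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


if __name__ == "__main__":
    # Residue-level bigrams over a toy peptide.
    print(count_ngrams(["MKTAMK"], 2).most_common(1))
    # Codon-level bigrams: the "gram" unit is now a codon, not a base.
    print(count_ngrams([split_codons("ATGAAAATG")], 2))
```

Storing n-grams as tuples keeps the same function usable for residues, bases, codons, or any other unit.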