3,445 results on '"Structural Bioinformatics"'
Search Results
2. Studying protein–protein interactions: Latest and most popular approaches
- Author
- Akbarzadeh, Sama, Coşkun, Özlem, and Günçer, Başak
- Published
- 2024
3. Protein structural context of cancer mutations reveals molecular mechanisms and candidate driver genes
- Author
- Chillón-Pino, Diego, Badonyi, Mihaly, Semple, Colin A., and Marsh, Joseph A.
- Published
- 2024
4. PyPropel: a Python-based tool for efficiently processing and characterising protein data.
- Author
- Sun, Jianfeng, Ru, Jinlong, Cribbs, Adam P., and Xiong, Dapeng
- Abstract
Background: The volume of protein sequence data has grown exponentially in recent years, driven by advancements in metagenomics. Despite this, a substantial proportion of these sequences remain poorly annotated, underscoring the need for robust bioinformatics tools to facilitate efficient characterisation and annotation for functional studies. Results: We present PyPropel, a Python-based computational tool developed to streamline the large-scale analysis of protein data, with a particular focus on applications in machine learning. PyPropel integrates sequence and structural data pre-processing, feature generation, and post-processing for model performance evaluation and visualisation, offering a comprehensive solution for handling complex protein datasets. Conclusion: PyPropel provides added value over existing tools by offering a unified workflow that encompasses the full spectrum of protein research, from raw data pre-processing to functional annotation and model performance analysis, thereby supporting efficient protein function studies. [ABSTRACT FROM AUTHOR]
- Published
- 2025
5. Porter 6: Protein Secondary Structure Prediction by Leveraging Pre-Trained Language Models (PLMs).
- Author
- Alanazi, Wafa, Meng, Di, and Pollastri, Gianluca
- Subjects
- *PROTEIN structure prediction, *LANGUAGE models, *COMPUTATIONAL biology, *PROTEIN structure, *STRUCTURAL bioinformatics, *DEEP learning, *NATURAL language processing
- Abstract
Accurate protein secondary structure prediction (PSSP) is crucial for understanding protein function, which is foundational to advancements in drug development, disease treatment, and biotechnology. Researchers gain critical insights into protein folding and function within cells by predicting protein secondary structures. The advent of deep learning models, capable of processing complex sequence data and identifying meaningful patterns, offers substantial potential to enhance the accuracy and efficiency of protein structure predictions. In particular, recent breakthroughs in deep learning, driven by the integration of natural language processing (NLP) algorithms, have significantly advanced the field of protein research. Inspired by the remarkable success of NLP techniques, this study harnesses the power of pre-trained language models (PLMs) to advance PSSP. We conduct a comprehensive evaluation of various deep learning models trained on distinct sequence embeddings, including one-hot encoding and PLM-based approaches such as ProtTrans and ESM-2, to develop a cutting-edge prediction system optimized for accuracy and computational efficiency. Our proposed model, Porter 6, is an ensemble of CBRNN-based predictors, leveraging the protein language model ESM-2 as input features. Porter 6 achieves outstanding performance on large-scale, independent test sets. On a 2022 test set, the model attains an impressive 86.60% accuracy in three-state (Q3) and 76.43% in eight-state (Q8) classifications. When tested on a more recent 2024 test set, Porter 6 maintains robust performance, achieving 84.56% in Q3 and 74.18% in Q8 classifications. This represents a significant 3% improvement over its predecessor, outperforming or matching state-of-the-art approaches in the field. [ABSTRACT FROM AUTHOR]
- Published
- 2025
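The Q3 and Q8 figures quoted in the Porter 6 abstract are per-residue accuracies over three and eight secondary-structure states. The following is a minimal Python sketch of how such scores are commonly computed; it is not taken from the paper. The function names, the toy strings, and the 8-to-3 state reduction (H, G, I to helix; E, B to strand; everything else to coil) are assumptions chosen for illustration, and other reductions are in use.

```python
# Assumed 8-state to 3-state reduction (one common convention, not
# necessarily the one used by Porter 6).
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helical states -> helix
    "E": "E", "B": "E",             # strand and bridge -> strand
    "T": "C", "S": "C", "C": "C",   # turn, bend, coil -> coil
}


def reduce_to_three_states(states8: str) -> str:
    """Map an 8-state string onto the assumed 3-state alphabet."""
    return "".join(DSSP8_TO_3[s] for s in states8)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example (hypothetical data, 16 residues).
observed8 = "HHHHGGTTEEEEBSCC"
predicted8 = "HHHHHHTTEEEECSCC"

q8 = per_residue_accuracy(predicted8, observed8)
q3 = per_residue_accuracy(
    reduce_to_three_states(predicted8), reduce_to_three_states(observed8)
)
```

Q3 is never lower than Q8 for the same prediction under such a reduction, because any residue that matches in eight states also matches after both strings are collapsed to three.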
6. PaleAle 6.0: Prediction of Protein Relative Solvent Accessibility by Leveraging Pre-Trained Language Models (PLMs).
- Author
- Alanazi, Wafa, Meng, Di, and Pollastri, Gianluca
- Subjects
- *ARTIFICIAL neural networks, *PROTEIN structure prediction, *NATURAL language processing, *LANGUAGE models, *COMPUTATIONAL biology, *DEEP learning
- Abstract
Predicting the relative solvent accessibility (RSA) of a protein is critical to understanding its 3D structure and biological function. RSA prediction, especially when homology transfer cannot provide information about a protein's structure, is a significant step toward addressing the protein structure prediction challenge. Today, deep learning is arguably the most powerful method for predicting RSA and other structural features of proteins. In particular, recent breakthroughs in deep learning, driven by the integration of natural language processing (NLP) algorithms, have significantly advanced the field of protein research. Inspired by the remarkable success of NLP techniques, this study leverages pre-trained language models (PLMs) to enhance RSA prediction. We present a deep neural network architecture based on a combination of bidirectional recurrent neural networks and convolutional layers that can analyze long-range interactions within protein sequences and predict protein RSA using ESM-2 encoding. The final predictor, PaleAle 6.0, predicts RSA in real values as well as two-state (exposure threshold of 25%) and four-state (exposure thresholds of 4%, 25%, and 50%) discrete classifications. On the 2022 test set, PaleAle 6.0 achieved over 82% accuracy for two-state RSA (RSA_2C) and 59.75% accuracy for four-state RSA (RSA_4C), with a Pearson correlation coefficient (PCC) of 77.88 for real-value RSA prediction. When evaluated on the more challenging 2024 test set, PaleAle 6.0 maintained a strong performance, achieving 79.74% accuracy in the two-state prediction and 55.30% accuracy in the four-state prediction, with a PCC of 73.08 for real-value predictions, outperforming all previously benchmarked predictors. [ABSTRACT FROM AUTHOR]
- Published
- 2025
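The two-state and four-state RSA classes named in the PaleAle 6.0 abstract can be illustrated with a small binning sketch in Python. Only the threshold values (25%, and 4%, 25%, 50%) come from the abstract. The function name, the class numbering, and the treatment of values that fall exactly on a threshold (assigned here to the more exposed class) are assumptions, not the authors' stated convention.

```python
from bisect import bisect_right

# Exposure thresholds in percent, as quoted in the abstract.
TWO_STATE_THRESHOLDS = (25.0,)
FOUR_STATE_THRESHOLDS = (4.0, 25.0, 50.0)


def rsa_class(rsa_percent: float, thresholds: tuple) -> int:
    """Return the index of the exposure class for an RSA value in percent.

    Class 0 is the most buried. A value equal to a threshold goes to the
    higher (more exposed) class; this boundary rule is an assumption.
    """
    return bisect_right(thresholds, rsa_percent)


# Hypothetical per-residue RSA values for a short peptide.
rsa_values = [1.5, 4.0, 18.0, 30.0, 72.0]
two_state = [rsa_class(v, TWO_STATE_THRESHOLDS) for v in rsa_values]
four_state = [rsa_class(v, FOUR_STATE_THRESHOLDS) for v in rsa_values]
```

Using `bisect_right` keeps the thresholds in one place, so the same function serves both the two-state and the four-state scheme.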
7. E2F1-induced autocrine IL-6 inflammatory loop mediates cancer-immune crosstalk that predicts T cell phenotype switching and therapeutic responsiveness.
- Author
- Spitschak, Alf, Dhar, Prabir, Singh, Krishna P., Casalegno Garduño, Rosaely, Gupta, Shailendra K., Vera, Julio, Musella, Luca, Murr, Nico, Stoll, Anja, and Pützer, Brigitte M.
- Subjects
- GENE regulatory networks, T cells, CELL communication, EPITHELIAL-mesenchymal transition, STRUCTURAL bioinformatics
- Abstract
Melanoma is a metastatic, drug-refractory cancer with the ability to evade immunosurveillance. Cancer immune evasion involves interaction between tumor intrinsic properties and the microenvironment. The transcription factor E2F1 is a key driver of tumor evolution and metastasis. To explore E2F1's role in immune regulation in the presence of aggressive melanoma cells, we established a coculture system and utilized transcriptome and cytokine arrays combined with bioinformatics and structural modeling. We identified an E2F1-dependent gene regulatory network with IL6 as a central hub. E2F1-induced IL-6 secretion unleashes an autocrine inflammatory feedback loop driving invasiveness and epithelial-to-mesenchymal transition. IL-6-activated STAT3 physically interacts with E2F1 and cooperatively enhances IL-6 expression by binding to an E2F1-STAT3-responsive promoter element. The E2F1-STAT3/IL-6 axis strongly modulates the immune niche and generates a crosstalk with CD4+ cells resulting in transcriptional changes of immunoregulatory genes in melanoma and immune cells that are indicative of an inflammatory and immunosuppressive environment. Clinical data from TCGA demonstrated that elevated E2F1, STAT3, and IL-6 correlate with infiltration of Th2, while simultaneously blocking Th1 in primary and metastatic melanomas. Strikingly, E2F1 depletion reduces the secretion of typical type-2 cytokines thereby launching a Th2-to-Th1 phenotype shift towards an antitumor immune response. The impact of activated E2F1-STAT3/IL-6 axis on melanoma-immune cell communication and its prognostic/therapeutic value was validated by mathematical modeling. This study addresses important molecular aspects of the tumor-associated microenvironment in modulating immune responses, and will contribute significantly to the improvement of future cancer therapies. [ABSTRACT FROM AUTHOR]
- Published
- 2025
8. Surface exposed and charged residues drive thermostability in fungi.
- Author
- Senthilkumar, Shricharan, Mahesh, Sankar, Jaisankar, Subachandran, and Yennamalli, Ragothaman M.
- Abstract
Fungi, though mesophilic, also include thermophilic and thermostable species. The thermostability of proteins observed in these fungi is most likely attributable to several molecular factors, such as the presence of salt bridges and hydrogen bond interactions between side chains. These factors cannot be generalized for all fungi. Factors impacting thermostability can guide how fungal thermophilic proteins gain thermostability. We curated a dataset of proteins for 14 thermophilic fungi and their evolutionarily closer mesophiles. Additionally, the proteomes of Chaetomium thermophilum and its evolutionarily related mesophile Chaetomium globosum were analyzed. Using eggNOG, we categorized the proteomes into clusters of orthologous groups (COGs). While the individual count of proteins is over‐represented in mesophiles (for COGs S, G, L, and Q), there are certain features that are significantly enriched in thermophiles (such as charged residues, exposed residues, polar residues, etc.). Since fungi are known to be cellulolytic and chitinolytic by nature, we selected 37 existing carbohydrate‐active enzyme (CAZyme) families in Eurotiales, Mucorales, and Sordariales. We looked at closely similar sequences and their modeled structures for further comparison. Comparing solvent accessibilities of thermophilic and mesophilic proteins, exposed and intermediate residues are more abundant in thermophiles, whereas buried residues are more abundant in mesophiles. For five specific CAZyme families (GH7, GH11, GH18, GH45, and CBM1) we looked at position‐specific substitutions between thermophiles and mesophiles. We also found that there are relatively more intramolecular interactions in thermophiles compared to mesophiles. Thus, we found factors such as surface exposed residues and charged residues that are highly likely to impart thermostability in fungi, and this study sets the stage for further studies in the area of fungal thermostability. [ABSTRACT FROM AUTHOR]
- Published
- 2025
9. Challenges in bridging the gap between protein structure prediction and functional interpretation.
- Author
- Varadi, Mihaly, Tsenkov, Maxim, and Velankar, Sameer
- Abstract
The rapid evolution of protein structure prediction tools has significantly broadened access to protein structural data. Although predicted structure models have the potential to accelerate and impact fundamental and translational research significantly, it is essential to note that they are not validated and cannot be considered the ground truth. Thus, challenges persist, particularly in capturing protein dynamics, predicting multi‐chain structures, interpreting protein function, and assessing model quality. Interdisciplinary collaborations are crucial to overcoming these obstacles. Databases like the AlphaFold Protein Structure Database, the ESM Metagenomic Atlas, and initiatives like the 3D‐Beacons Network provide FAIR access to these data, enabling their interpretation and application across a broader scientific community. Whilst substantial advancements have been made in protein structure prediction, further progress is required to address the remaining challenges. Developing training materials, nurturing collaborations, and ensuring open data sharing will be paramount in this pursuit. The continued evolution of these tools and methodologies will deepen our understanding of protein function and accelerate disease pathogenesis and drug development discoveries. [ABSTRACT FROM AUTHOR]
- Published
- 2025
10. Analysis of Alzheimer’s disease associated deleterious non-synonymous single nucleotide polymorphisms and their impacts on protein structure and function by performing in-silico methods.
- Author
- Akcesme, Betul, Islam, Nadia, Lekic, Delila, Cutuk, Raisa, and Basovic, Nejla
- Abstract
Alzheimer’s disease (AD) is a neurodegenerative disorder that presents with a progressive loss of memory, a decline in cognitive abilities, and multiple changes in behavior. Its pathogenicity has been linked to genetic factors in approximately 60–80% of cases, specifically to the APOE gene family as well as other gene families. This study utilized advanced computational biology methods to analyze AD-associated nsSNPs extracted from the NHGRI-EBI GWAS Catalog. Ensembl Variant Effect Predictor (VEP) was used to annotate the variants associated with AD. Annotated missense variants were submitted to the PolyPhen-2, SNPs&GO, and PredictSNP servers, which predict the pathogenicity of selected missense variants from protein sequence information. The DynaMut and DUET servers were applied to estimate changes in protein stability caused by the amino acid change by integrating protein structure information. Missense variations associated with AD were annotated to 26 proteins and further analyzed in our study. Following rigorous data filtration steps, 15 candidate variants (13 proteins) were identified and subjected to sequence and structure-based analysis. Finally, in this in silico study, five deleterious non-synonymous single nucleotide polymorphisms (nsSNPs) were identified in ACKR2(V41A), APOE(R176C), ATP8B4(G395S), LAMB2(E987K), and TOMM40(R239W), and these findings were subsequently backed up by existing in vivo and in vitro literature. This study not only provides invaluable insight into the intricate pathogenic mechanisms underlying AD but also offers a distinctive perspective that paves the way for future, more comprehensive investigations aimed at unraveling the molecular intricacies responsible for the development and progression of AD. Nonetheless, it is imperative that further rigorous in vivo and in vitro experiments are conducted to validate and expand upon the findings presented here. [ABSTRACT FROM AUTHOR]
- Published
- 2025
11. Discovery of the bacterial HslV protease activators as lead molecules with novel mode of action
- Author
- Aurangzeb Sana, Aurongzeb Muhammad, Shamim Shahbaz, Rashid Yasmeen, Khan Khalid Mohammed, Aziz Tariq, Alharbi Metab, and Alasmari Abdullah F.
- Subjects
- the hslvu complex, hslv protease activators, structural bioinformatics, nma analysis, molecular docking studies, Chemistry, QD1-999
- Abstract
The HslVU enzyme complex, a proteasomal analog found in bacteria, consists of two components, i.e., the HslV protease and the HslU ATPase. These proteins come together to form a functional enzyme complex, where the C-terminal helix of each HslU subunit is inserted into the binding pocket of each HslV dimer. This interaction leads to the activation of the HslV protease through allosteric mechanisms, enabling its enzymatic function. This bacterial complex is regarded as an attractive target for drug development due to its presence in disease-causing microorganisms and concurrent absence in humans. The objective of this research was to identify promising drug candidates that could excessively stimulate the HslV protease, leading to uncontrolled protein breakdown in the pathogens. Four dihydropyrimidone derivatives have been identified as potential activators of HslV protease exhibiting high docking scores, favorable binding patterns, and significant in vitro activation capabilities. These compounds have demonstrated effective dose 50 values within the sub-micromolar range, i.e., 0.4–0.58 µM. Normal mode analysis investigations provided additional confirmation regarding the stability of the conformational interactions between the HslV protease and the active compounds. In addition, the predicted absorption, distribution, metabolism, excretion, and toxicity properties of these lead compounds demonstrated their considerable drug-like and non-toxic qualities. This study not only presents more potent small non-peptide activators of the HslV protease but also enhances the understanding of the mechanism of HslVU complex activation via small non-peptidic molecules.
- Published
- 2025
12. An outlook on structural biology after AlphaFold: tools, limits and perspectives
- Author
- Serena Rosignoli, Maddalena Pacelli, Francesca Manganiello, and Alessandro Paiardini
- Subjects
- AlphaFold, machine learning, structural bioinformatics, structure prediction, Biology (General), QH301-705.5
- Abstract
AlphaFold and similar groundbreaking, AI‐based tools, have revolutionized the field of structural bioinformatics, with their remarkable accuracy in ab‐initio protein structure prediction. This success has catalyzed the development of new software and pipelines aimed at incorporating AlphaFold's predictions, often focusing on addressing the algorithm's remaining challenges. Here, we present the current landscape of structural bioinformatics shaped by AlphaFold, and discuss how the field is dynamically responding to this revolution, with new software, methods, and pipelines. While the excitement around AI‐based tools led to their widespread application, it is essential to acknowledge that their practical success hinges on their integration into established protocols within structural bioinformatics, often neglected in the context of AI‐driven advancements. Indeed, user‐driven intervention is still as pivotal in the structure prediction process as in complementing state‐of‐the‐art algorithms with functional and biological knowledge.
- Published
- 2025
13. Prediction of Protein Secondary Structures Based on Substructural Descriptors of Molecular Fragments.
- Author
- Zakharov, Oleg S., Rudik, Anastasia V., Filimonov, Dmitry A., and Lagunin, Alexey A.
- Subjects
- *PROTEIN structure prediction, *MOLECULAR biology, *PROTEIN structure, *STRUCTURAL bioinformatics, *BANKING industry
- Abstract
The accurate prediction of secondary structures of proteins (SSPs) is a critical challenge in molecular biology and structural bioinformatics. Despite recent advancements, this task remains complex and demands further exploration. This study presents a novel approach to SSP prediction using atom-centric substructural multilevel neighborhoods of atoms (MNA) descriptors for protein molecular fragments. A dataset comprising over 335,000 SSPs, annotated by the Dictionary of Secondary Structure in Proteins (DSSP) software from 37,000 proteins, was constructed from Protein Data Bank (PDB) records with a resolution of 2 Å or better. Protein fragments were converted into structural formulae using the RDKit Python package and stored in SD files using the MOL V3000 format. Classification sequence–structure–property relationships (SSPR) models were developed with varying levels of MNA descriptors and a Bayesian algorithm implemented in MultiPASS software. The average prediction accuracy (AUC) for eight SSP types, calculated via leave-one-out cross-validation, was 0.902. For independent test sets (ASTRAL and CB513 datasets), the best SSPR models achieved AUC, Q3, and Q8 values of 0.860, 77.32%, 70.92% and 0.889, 78.78%, 74.74%, respectively. Based on the created models, a freely available web application MNA-PSS-Pred was developed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. 50 Years of Antibody Numbering Schemes: A Statistical and Structural Evaluation Reveals Key Differences and Limitations.
- Author
- Zhu, Zirui, Olson, Katherine S., and Magliery, Thomas J.
- Subjects
- *IMMUNOGLOBULIN light chains, *STRUCTURAL bioinformatics, *IMMUNE recognition, *IMMUNE response, *IMMUNOGLOBULINS
- Abstract
Background: The complementarity-determining region (CDR) of antibodies represents the most diverse region both in terms of sequence and structural characteristics, playing the most critical role in antibody recognition and binding for immune responses. Over the past decades, several numbering schemes have been introduced to define CDRs based on sequence. However, the existence of diverse numbering schemes has led to potential confusion, and a comprehensive evaluation of these schemes is lacking. Methods: We employ statistical analyses to quantify the diversity of CDRs compared to the framework regions. Results: Comparative analyses across different numbering schemes demonstrate notable variations in CDR definitions. The Kabat and AbM numbering schemes tend to incorporate more conserved residues into their CDR definitions, whereas CDRs defined by the Chothia and IMGT numbering schemes display greater diversity, sometimes missing certain loop residues. Notably, we identify a critical residue, L29, within the kappa light chain CDR1, which appears to act as a pivotal structural point within the loop. In contrast, most numbering schemes designate the topological equivalent point in the lambda light chain as L30, suggesting the need for further refinement in the current numbering schemes. Conclusions: These findings shed light on regional sequence and structural conservation within antibody sequence databases while also highlighting discrepancies stemming from different numbering schemes. These insights yield valuable guidelines for the precise delineation of antibody CDRs and the strategic design of antibody repertoires, with practical implications in developing innovative antibody-based therapeutics and diagnostics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. A self-adaptive evolutionary algorithm using Monte Carlo Fragment insertion and conformation clustering for the protein structure prediction problem.
- Author
- Parpinelli, Rafael Stubs, Will, Nilcimar Neitzel, and da Silva, Renan Samuel
- Subjects
- *PROTEIN structure prediction, *PROTEIN conformation, *EVOLUTIONARY algorithms, *PROTEIN structure, *STRUCTURAL bioinformatics
- Abstract
The Protein Structure Prediction Problem is one of the most important and challenging open problems in Computer Science and Structural Bioinformatics. Accurately predicting protein conformations would significantly impact several fields, such as understanding proteinopathies and developing smart protein-based drugs. As such, this work has as its primary goal to improve the prediction power of ab initio methods by utilizing a self-adaptive evolutionary algorithm using Monte Carlo based fragment insertion and conformational clustering. A meta-heuristic is used as the core of the conformation sampling process with fragment insertion, feeding domain-specific information into the process. The online parameter control routines allow the method to adapt to a protein's structure specificity and behave dynamically in different stages of the optimization process. The results obtained by the proposed method were compared to results obtained from several other algorithms found in the literature. It is possible to conclude that the proposed method is highly competitive in terms of free-energy and RMSD for the protein set used in the experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Analyzing aptamer structure and interactions: in silico modelling and instrumental methods.
- Author
- Malysheva, Daria O., Dymova, Maya A., and Richter, Vladimir A.
- Abstract
Aptamers are short oligonucleotides that bind specifically to various ligands and are characterized by their low immunogenicity, thermostability, and ease of labeling. Many biomedical applications of aptamers as biosensors and drug delivery agents are currently being actively researched. Systematic evolution of ligands by exponential enrichment (SELEX) allows the discovery of aptamers for a specific target, but it only provides information about the sequence of aptamers; hence other approaches are used for determining aptamer structure, aptamer-ligand interactions and the mechanism of action. The first is in silico modelling, which makes it possible to infer likely secondary and tertiary structures and to model their interactions with a ligand. The second approach is to use instrumental methods to study structure and aptamer-ligand interaction. In silico modelling and instrumental methods are complementary, and their combined use helps eliminate some ambiguity in their respective results. This review examines both the advantages and limitations of in silico modelling and instrumental approaches currently used to study aptamers, which will allow researchers to develop optimal study designs for analyzing aptamer structure and ligand interactions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. First transcriptome analysis of the venom glands of the scorpion Hottentotta zagrosensis (Scorpions: Buthidae) with focus on venom lipolysis activating peptides.
- Author
- Salabi, Fatemeh, Jafari, Hedieh, Mahdavinia, Masoud, Azadnasab, Reza, Shariati, Saeedeh, Baghal, Mahsa Lari, Tebianian, Majid, and Baradaran, Masoumeh
- Subjects
- VENOM glands, PEPTIDES, ANTIMICROBIAL peptides, STRUCTURAL bioinformatics, ENZYME regulation, VENOM, SCORPION venom, LIPOLYSIS
- Abstract
Introduction: Scorpion venom is a rich source of biologically active peptides and proteins. Transcriptome analysis of the venom gland provides detailed insights about peptide and protein venom components. Following the transcriptome analysis of different species in our previous studies, our research team has focused on Hottentotta zagrosensis, one of the endemic scorpions of Iran, to obtain information about its venom proteins, in order to develop biological research focusing on medicinal applications of scorpion venom components and antivenom production. To gain insights into the protein composition of this scorpion venom, we performed transcriptomic analysis. Methods: Transcriptomic analysis of the venom gland of H. zagrosensis, prepared from the Khuzestan province, was performed through Illumina paired-end sequencing (RNA-Seq), Trinity de novo assembly, CD-Hit-EST clustering, and annotation of identified primary structures using bioinformatics approaches. Results: Transcriptome analysis showed the presence of 96.4% of complete arthropod BUSCOs, indicating a high-quality assembly. From a total of 45,795,108 paired-end 150 bp trimmed reads, the clustering step resulted in the generation of 101,180 de novo assembled transcripts with an N50 size of 1,149 bp. 96,071 Unigenes and 131,235 transcripts had a significant similarity (E-value 1e-3) with known proteins from UniProt, Swissprot, Animal toxin annotation project, and the Pfam database. The results were validated using InterProScan. These mainly correspond to ion channel inhibitors, metalloproteinases, neurotoxins, protease inhibitors, protease activators, cysteine-rich secretory proteins, phospholipase A enzymes, antimicrobial peptides, growth factors, lipolysis-activating peptides, hyaluronidase, and phospholipase D. Our venom gland transcriptomic approach identified several biologically active peptides including five LVP1-alpha and LVP1-beta isoforms, which we named HzLVP1_alpha1, HzLVP1_alpha2, HzLVP1_alpha3, HzLVP1_beta1, and HzLVP1_beta and have extensively characterized here. Discussion: Except for HzLVP1_beta1, all other identified LVP1s are predicted to be stable proteins (instability index <40). Moreover, all isoforms of the LVP1 alpha and beta subunits are thermostable, with the highest stability for HzLVP1_alpha2 (aliphatic index = 71.38). HzLVP1_alpha2 also has the highest half-life. The three-dimensional structures of all identified proteins are compacted by three disulfide bridges. The extra cysteine residue may allow the proteins to form a hetero- or homodimer. LVP1 subunits of H. zagrosensis potentially interact with adipose triglyceride lipase (ATGL) and hormone-sensitive lipase (HSL), two key enzymes in regulation of lipolysis in adipocytes, suggesting pharmacological properties of these identified proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2024
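The N50 value reported for the H. zagrosensis assembly is a standard contiguity statistic: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. A minimal Python sketch of that definition follows; the function name and the toy lengths are illustrative assumptions and do not reproduce the authors' pipeline or data.

```python
def n50(lengths: list) -> int:
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of all bases covered
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical transcript lengths in bp.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Because the statistic depends only on the length distribution, it says nothing about whether the transcripts are correct, which is why the abstract also reports BUSCO completeness.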
18. E2F1-induced autocrine IL-6 inflammatory loop mediates cancer-immune crosstalk that predicts T cell phenotype switching and therapeutic responsiveness.
- Author
- Spitschak, Alf, Dhar, Prabir, Singh, Krishna P., Garduño, Rosaely Casalegno, Gupta, Shailendra K., Vera, Julio, Musella, Luca, Murr, Nico, Stoll, Anja, and Pützer, Brigitte M.
- Subjects
- GENE regulatory networks, T cells, CELL communication, EPITHELIAL-mesenchymal transition, STRUCTURAL bioinformatics
- Abstract
Melanoma is a metastatic, drug-refractory cancer with the ability to evade immunosurveillance. Cancer immune evasion involves interaction between tumor intrinsic properties and the microenvironment. The transcription factor E2F1 is a key driver of tumor evolution and metastasis. To explore E2F1's role in immune regulation in the presence of aggressive melanoma cells, we established a coculture system and utilized transcriptome and cytokine arrays combined with bioinformatics and structural modeling. We identified an E2F1-dependent gene regulatory network with IL6 as a central hub. E2F1-induced IL-6 secretion unleashes an autocrine inflammatory feedback loop driving invasiveness and epithelial-to-mesenchymal transition. IL-6-activated STAT3 physically interacts with E2F1 and cooperatively enhances IL-6 expression by binding to an E2F1-STAT3-responsive promoter element. The E2F1-STAT3/IL-6 axis strongly modulates the immune niche and generates a crosstalk with CD4+ cells resulting in transcriptional changes of immunoregulatory genes in melanoma and immune cells that are indicative of an inflammatory and immunosuppressive environment. Clinical data from TCGA demonstrated that elevated E2F1, STAT3, and IL-6 correlate with infiltration of Th2, while simultaneously blocking Th1 in primary and metastatic melanomas. Strikingly, E2F1 depletion reduces the secretion of typical type-2 cytokines thereby launching a Th2-to-Th1 phenotype shift towards an antitumor immune response. The impact of activated E2F1-STAT3/IL-6 axis on melanoma-immune cell communication and its prognostic/therapeutic value was validated by mathematical modeling. This study addresses important molecular aspects of the tumor-associated microenvironment in modulating immune responses, and will contribute significantly to the improvement of future cancer therapies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. deepBBQ: A Deep Learning Approach to the Protein Backbone Reconstruction.
- Author
-
Kryś, Justyna D., Głowacki, Maksymilian, Śmieja, Piotr, and Gront, Dominik
- Subjects
- *
CONVOLUTIONAL neural networks , *BIOINFORMATICS software , *PEPTIDES , *STRUCTURAL bioinformatics , *CARTESIAN coordinates - Abstract
Coarse-grained models have provided researchers with greatly improved computational efficiency in modeling structures and dynamics of biomacromolecules, but, to be practically useful, they need fast and accurate conversion methods back to the all-atom representation. Reconstruction of atomic details may also be required in the case of some experimental methods, like electron microscopy, which may provide C α -only structures. In this contribution, we present a new method for recovery of all backbone atom positions from just the C α coordinates. Our approach, called deepBBQ, uses a deep convolutional neural network to predict a single internal coordinate per peptide plate, based on C α trace geometric features, and then proceeds to recalculate the cartesian coordinates based on the assumption that the peptide plate atoms lie in the same plane. Extensive comparison with similar programs shows that our solution is accurate and cost-efficient. The deepBBQ program is available as part of the open-source bioinformatics toolkit Bioshell and is free for download and the documentation is available online. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Mut-Map: Comprehensive Computational Pipeline for Structural Mapping and Analysis of Cancer-Associated Mutations.
- Author
-
Alsulami, Ali F
- Subjects
SOMATIC mutation, BANKING industry, GENETIC mutation, STRUCTURAL bioinformatics, PROTEIN structure
- Abstract
Understanding the functional impact of genetic mutations on protein structures is essential for advancing cancer research and developing targeted therapies. The main challenge lies in accurately mapping these mutations to protein structures and analysing their effects on protein function. To address this, Mut-Map (https://genemutation.org/) is a comprehensive computational pipeline designed to integrate mutation data from the Catalogue Of Somatic Mutations In Cancer database with protein structural data from the Protein Data Bank and AlphaFold models. The pipeline begins by taking a UniProt ID and proceeds through mapping corresponding Protein Data Bank structures, renumbering residues, and assessing disorder percentages. It then overlays mutation data, categorizes mutations based on structural context, and visualizes them using advanced tools like MolStar. This approach allows for a detailed analysis of how mutations may disrupt protein function by affecting key regions such as DNA interfaces, ligand-binding sites, and dimer interactions. To validate the pipeline, a case study on the TP53 gene, a critical tumour suppressor often mutated in cancers, was conducted. The analysis highlighted the most frequent mutations occurring at the DNA-binding interface, providing insights into their potential role in cancer progression. Mut-Map offers a powerful resource for elucidating the structural implications of cancer-associated mutations, paving the way for more targeted therapeutic strategies and advancing our understanding of protein structure–function relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Bioinformatics in Russia: history and present-day landscape.
- Author
-
Nawaz, Muhammad A, Pamirsky, Igor E, and Golokhvast, Kirill S
- Subjects
BIOLOGICAL databases, STRUCTURAL bioinformatics, COMPUTATIONAL biology, MOLECULAR biology, NUCLEOTIDE sequencing
- Abstract
Bioinformatics has become an interdisciplinary subject due to its universal role in molecular biology research. The current status of bioinformatics research in Russia is not known. Here, we review the history of bioinformatics in Russia, present the current landscape, and highlight future directions and challenges. Bioinformatics research in Russia is driven by four major industries: information technology, pharmaceuticals, biotechnology, and agriculture. Over the past three decades, despite a delayed start, the field has gained momentum, especially in protein and nucleic acid research. Dedicated and shared centers for genomics, proteomics, and bioinformatics are active in different regions of Russia. Present-day bioinformatics in Russia is characterized by research issues related to genetics, metagenomics, OMICs, medical informatics, computational biology, environmental informatics, and structural bioinformatics. Notable developments are in the fields of software (tools, algorithms, and pipelines), use of high computation power (e.g. by the Siberian Supercomputer Center), and large-scale sequencing projects (the sequencing of 100 000 human genomes). Government funding is increasing, policies are being changed, and a National Genomic Information Database is being established. An increased focus on eukaryotic genome sequencing, the development of a common place for developers and researchers to share tools and data, and the use of biological modeling, machine learning, and biostatistics are key areas for future focus. Universities and research institutes have started to implement bioinformatics modules. A critical mass of bioinformaticians is essential to catch up with the global pace in the discipline. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Atomistic simulations reveal impacts of missense mutations on the structure and function of SynGAP1.
- Author
-
Ali, Aliaa E, Li, Li-Li, Courtney, Michael J, Pentikäinen, Olli T, and Postila, Pekka A
- Subjects
MISSENSE mutation, DRUG discovery, ALZHEIMER'S disease, MOLECULAR dynamics, MUTAGENESIS
- Abstract
De novo mutations in the synaptic GTPase activating protein (SynGAP) are associated with neurological disorders like intellectual disability, epilepsy, and autism. SynGAP is also implicated in Alzheimer's disease and cancer. Although pathogenic variants are highly penetrant in neurodevelopmental conditions, a substantial number of them are caused by missense mutations that are difficult to diagnose. Hence, in silico mutagenesis was performed for probing the missense effects within the N-terminal region of SynGAP structure. Through extensive molecular dynamics simulations, encompassing three 150-ns replicates for 211 variants, the impact of missense mutations on the protein fold was assessed. The effect of the mutations on the folding stability was also quantitatively assessed using free energy calculations. The mutations were categorized as potentially pathogenic or benign based on their structural impacts. Finally, the study introduces wild-type-SynGAP in complex with RasGTPase at the inner membrane, while considering the potential effects of mutations on these key interactions. This study provides a structural perspective on the clinical assessment of SynGAP missense variants and lays the foundation for future structure-based drug discovery. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Proteome structuring of crown-of-thorns starfish.
- Author
-
Yunchi Zhu and Zuhong Lu
- Subjects
LIFE sciences, PROTEIN structure prediction, TRANSMEMBRANE domains, MEDICAL sciences, LANGUAGE models
- Abstract
The article "Proteome structuring of crown-of-thorns starfish" discusses the importance of understanding the genetic basis of the crown-of-thorns starfish (COTS) to develop biocontrol methods. Scientists have sequenced the COTS genome and are using AI-based protein structure prediction systems to enhance protein annotations. The study predicts 31,743 protein structures of COTS, providing insights into its biology and potential biocontrol strategies. The research aims to deepen our understanding of COTS biology and contribute to the protection of coral reefs. [Extracted from the article]
- Published
- 2024
24. The fitness cost of spurious phosphorylation.
- Author
-
Bradley, David, Hogrebe, Alexander, Dandage, Rohan, Dubé, Alexandre K, Leutert, Mario, Dionne, Ugo, Chang, Alexis, Villén, Judit, and Landry, Christian R
- Subjects
PROTEIN kinases, PROTEIN-tyrosine kinases, STRUCTURAL bioinformatics, PHOSPHOTYROSINE, CYTOSKELETAL proteins
- Abstract
The fidelity of signal transduction requires the binding of regulatory molecules to their cognate targets. However, the crowded cell interior risks off-target interactions between proteins that are functionally unrelated. How such off-target interactions impact fitness is not generally known. Here, we use Saccharomyces cerevisiae to inducibly express tyrosine kinases. Because yeast lacks bona fide tyrosine kinases, the resulting tyrosine phosphorylation is biologically spurious. We engineered 44 yeast strains each expressing a tyrosine kinase, and quantitatively analysed their phosphoproteomes. This analysis resulted in ~30,000 phosphosites mapping to ~3500 proteins. The number of spurious pY sites generated correlates strongly with decreased growth, and we predict over 1000 pY events to be deleterious. However, we also find that many of the spurious pY sites have a negligible effect on fitness, possibly because of their low stoichiometry. This result is consistent with our evolutionary analyses demonstrating a lack of phosphotyrosine counter-selection in species with tyrosine kinases. Our results suggest that, alongside the risk for toxicity, the cell can tolerate a large degree of non-functional crosstalk as interaction networks evolve. Synopsis: How do off-target interactions between proteins inside crowded cells affect cellular fitness? Expression of human tyrosine kinases in yeast, which lacks general tyrosine phosphorylation, shows that many individual spurious pY sites have no impact, but also a strong negative correlation between phosphorylation and fitness. 20% of pY sites are predicted to destabilise the substrate. Native pS/pT signalling in yeast is likely dysregulated. Many Y sites are likely phosphorylated at low stoichiometry. Spurious phosphorylation is widespread and not restricted to homologs of native substrates. Evolutionary selection to avoid spurious Y phosphorylation appears unlikely. 
Expression of human tyrosine kinases in yeast shows that while there is a strong negative correlation between phosphorylation and fitness, many individual pY sites have negligible impact. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. MIPDS: a comprehensive database on the molecular interactions in protein dimer structures.
- Author
-
Sanjeevi, Madhumathi, Rajendran, Santhosh, Jeyaraman, Jeyakanthan, and Sekar, Kanagaraj
- Subjects
MOLECULAR structure, DRUG discovery, PROTEIN structure, DATABASES, STRUCTURAL bioinformatics
- Abstract
Although many studies have addressed the significance of interactions in the dimeric structures of proteins, there is no dedicated database of these interactions. To this end, the Molecular Interactions in Protein Dimer Structure (MIPDS) database has been developed; it is an open‐access repository containing 60 298 3D structures of dimeric proteins sourced from the Protein Data Bank. This helps researchers comprehend the types of interaction, which include those mediated by water, small molecules or ligands and direct interactions, in 3D structures at the molecular level. The database is accessible through a user‐friendly interface, where users can conduct searches based on PDB accession number, interaction type and geometric parameters. It can be viewed in textual and graphical formats using the plug‐in JSmol. MIPDS is updated weekly using programmed scripts to incorporate newly released dimeric structures and analyses of their interaction types. The database is intended for the scientific community working in structural biology, structural bioinformatics, drug discovery and development. MIPDS is freely accessible to users worldwide at http://dicsoft1.physics.iisc.ac.in/mipds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
26. Aggrescan4D: A comprehensive tool for pH‐dependent analysis and engineering of protein aggregation propensity.
- Author
-
Zalewski, Mateusz, Iglesias, Valentin, Bárcenas, Oriol, Ventura, Salvador, and Kmiecik, Sebastian
- Abstract
Aggrescan4D (A4D) is an advanced computational tool designed for predicting protein aggregation, leveraging structural information and the influence of pH. Building upon its predecessor, Aggrescan3D (A3D), A4D has undergone numerous enhancements aimed at assisting the improvement of protein solubility. This manuscript reviews A4D's updated functionalities and explains the fundamental principles behind its pH‐dependent calculations. Additionally, it presents an antibody case study to evaluate its performance in comparison with other structure‐based predictors. Notably, A4D integrates advanced protein engineering protocols with pH‐dependent calculations, enhancing its utility in advising solubility‐enhancing mutations. A4D considers the impact of structural flexibility on aggregation propensities, and includes a large set of precalculated predictions. These capabilities should help to open new avenues for both understanding and managing protein aggregation. A4D is accessible through a dedicated web server at https://biocomp.chem.uw.edu.pl/a4d/. [ABSTRACT FROM AUTHOR]
- Published
- 2024
27. Tertiary structure assessment at CASP15
- Author
-
Simpkin, Adam J, Mesdaghi, Shahram, Rodríguez, Filomeno Sánchez, Elliott, Luc, Murphy, David L, Kryshtafovych, Andriy, Keegan, Ronan M, and Rigden, Daniel J
- Subjects
Biochemistry and Cell Biology, Biological Sciences, Furylfuramide, Computational Biology, Models, Molecular, Proteins, Sequence Alignment, CASP15, machine learning, molecular replacement, protein modelling, protein structure prediction, structural bioinformatics, Mathematical Sciences, Information and Computing Sciences, Bioinformatics, Biological sciences, Mathematical sciences
- Abstract
The results of tertiary structure assessment at CASP15 are reported. For the first time, recognizing the outstanding performance of AlphaFold 2 (AF2) at CASP14, all single-chain predictions were assessed together, irrespective of whether a template was available. At CASP15, there was no single stand-out group, with most of the best-scoring groups (led by PEZYFoldings, UM-TBM, and Yang Server) employing AF2 in one way or another. Many top groups paid special attention to generating deep Multiple Sequence Alignments (MSAs) and testing variant MSAs, thereby allowing them to successfully address some of the hardest targets. Such difficult targets, as well as lacking templates, were typically proteins with few homologues. Local divergence between prediction and target correlated with localization at crystal lattice or chain interfaces, and with regions exhibiting high B-factors in crystal structure targets, and should not necessarily be considered as representing error in the prediction. However, analysis of exposed and buried side chain accuracy showed room for improvement even in the latter. Nevertheless, a majority of groups produced high-quality predictions for most targets, which are valuable for experimental structure determination, functional analysis, and many other tasks across biology. These include those applying methods similar to those used to generate major resources such as the AlphaFold Protein Structure Database and the ESM Metagenomic atlas: the confidence estimates of the former were also notably accurate.
- Published
- 2023
28. Respiratory tract infections: an update on the complexity of bacterial diversity, therapeutic interventions and breakthroughs.
- Author
-
Panickar, Avani, Manoharan, Anand, Anbarasu, Anand, and Ramaiah, Sudha
- Abstract
Respiratory tract infections (RTIs) have a significant impact on global health, especially among children and the elderly. The key bacterial pathogens Streptococcus pneumoniae, Haemophilus influenzae, Klebsiella pneumoniae, Staphylococcus aureus and non-fermenting Gram-negative bacteria such as Acinetobacter baumannii and Pseudomonas aeruginosa are most commonly associated with RTIs. These bacterial pathogens have evolved a diverse array of resistance mechanisms through horizontal gene transfer, often mediated by mobile genetic elements and environmental acquisition. Treatment failures are primarily due to antimicrobial resistance and inadequate bacterial engagement, which necessitates the development of alternative treatment strategies. To overcome this, our review mainly focuses on different virulence mechanisms and their resulting pathogenicity, highlighting different therapeutic interventions to combat resistance. To prevent the antimicrobial resistance crisis, we also focused on leveraging the application of artificial intelligence and machine learning to manage RTIs. Integrative approaches combining mechanistic insights are crucial for addressing the global challenge of antimicrobial resistance in respiratory infections. [ABSTRACT FROM AUTHOR]
- Published
- 2024
29. A Computational Approach in the Systematic Search of the Interaction Partners of Alternatively Spliced TREM2 Isoforms.
- Author
-
Liang, Junyi, Menon, Aditya, Tomco, Taylor, Bhattarai, Nisha, Smith, Iris Nira, Khrestian, Maria, Formica, Shane V., Eng, Charis, Buck, Matthias, and Bekris, Lynn M.
- Subjects
MEMBRANE glycoproteins, ALZHEIMER'S disease, STRUCTURAL bioinformatics, NEUROFIBRILLARY tangles, MYELOID cells
- Abstract
Alzheimer's disease is the most common form of dementia, characterized by the pathological accumulation of amyloid-beta (Aβ) plaques and tau neurofibrillary tangles. Triggering receptor expressed on myeloid cells 2 (TREM2) is increasingly recognized as playing a central role in Aβ clearance and microglia activation in AD. The TREM2 gene transcriptional product is alternatively spliced to produce three different protein isoforms. The canonical TREM2 isoform binds to DAP12 to activate downstream pathways. However, little is known about the function or interaction partners of the alternative TREM2 isoforms. The present study utilized a computational approach in a systematic search for new interaction partners of the TREM2 isoforms by integrating several state-of-the-art structural bioinformatics tools from initial large-scale screening to one-on-one corroborative modeling and eventual all-atom visualization. CD9, a cell surface glycoprotein involved in cell–cell adhesion and migration, was identified as a new interaction partner for two TREM2 isoforms, and CALM, a calcium-binding protein involved in calcium signaling, was identified as an interaction partner for a third TREM2 isoform, highlighting the potential role of cell adhesion and calcium regulation in AD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. Structural determinants of Vibrio cholerae FeoB nucleotide promiscuity.
- Author
-
Lee, Mark, Magante, Kate, Gómez-Garzón, Camilo, Payne, Shelley M., and Smith, Aaron T.
- Subjects
VIBRIO cholerae, ADENINE nucleotides, ISOTHERMAL titration calorimetry, PATHOGENIC bacteria, STRUCTURAL bioinformatics
- Abstract
Ferrous iron (Fe2+) is required for the growth and virulence of many pathogenic bacteria, including Vibrio cholerae (Vc), the causative agent of the disease cholera. For this bacterium, Feo is the primary system that transports Fe2+ into the cytosol. FeoB, the main component of this system, is regulated by a soluble cytosolic domain termed NFeoB. Recent reanalysis has shown that NFeoBs can be classified as either GTP-specific or NTP-promiscuous, but the structural and mechanistic bases for these differences were not known. To explore this intriguing property of FeoB, we solved the X-ray crystal structures of VcNFeoB in both the apo and the GDP-bound forms. Surprisingly, this promiscuous NTPase displayed a canonical NFeoB G-protein fold like GTP-specific NFeoBs. Using structural bioinformatics, we hypothesized that residues surrounding the nucleobase could be important for both nucleotide affinity and specificity. We then solved the X-ray crystal structures of N150T VcNFeoB in the apo and GDP-bound forms to reveal H-bonding differences surrounding the guanine nucleobase. Interestingly, isothermal titration calorimetry revealed similar binding thermodynamics of the WT and N150T proteins to guanine nucleotides, while the behavior in the presence of adenine nucleotides was dramatically different. AlphaFold models of VcNFeoB in the presence of ADP and ATP showed important conformational changes that contribute to nucleotide specificity among FeoBs. Combined, these results provide a structural framework for understanding FeoB nucleotide promiscuity, which could be an adaptive measure utilized by pathogens to ensure adequate levels of intracellular iron across multiple metabolic landscapes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
31. Geometric descriptors for beta turns.
- Author
-
Newell, Nicholas E.
- Abstract
Beta turns, in which the protein backbone abruptly changes direction over four amino acid residues, are the most common type of protein secondary structure after alpha helices and beta sheets and play key structural and functional roles. Previous work has produced classification systems for turn geometry at multiple levels of precision, but these operate in backbone dihedral‐angle (Ramachandran) space, and the absence of a local Euclidean‐space coordinate system and structural alignment for turns, or of any systematic Euclidean‐space characterization of turn backbone shape, presents challenges for the visualization, comparison and analysis of the wide range of turn conformations and the design of turns and the structures that incorporate them. This work derives a turn‐local coordinate system that implicitly aligns turns, together with a set of geometric descriptors that characterize the bulk backbone (BB) shapes of turns and describe modes of structural variation not explicitly captured by existing systems. These modes are shown to be meaningful by the demonstration of clear relationships between descriptor values and the electrostatic energy of the beta‐turn H‐bond, the overrepresentations of key side‐chain motifs, and the structural contexts of turns. Geometric turn descriptors complement Ramachandran‐space classifications, and they can be used to select turn structures for compatibility with particular side‐chain interactions or contexts. Potential applications include protein design and other tasks in which an enhanced Euclidean‐space characterization of turns may improve understanding or performance. The web‐based tools ExploreTurns, MapTurns, and ProfileTurn, available at www.betaturn.com, incorporate turn‐local coordinates and turn descriptors and demonstrate their utility. [ABSTRACT FROM AUTHOR]
- Published
- 2024
32. Novel druggable space in human KRAS G13D discovered using structural bioinformatics and a P-loop targeting monoclonal antibody.
- Author
-
Jungholm, Oscar, Trkulja, Carolina, Moche, Martin, Srinivasa, Sreesha P., Christakopoulou, Maria-Nefeli, Davidson, Max, Reymer, Anna, Jardemark, Kent, Fogaça, Rafaela Lenza, Ashok, Anaswara, Jeffries, Gavin, Ampah-Korsah, Henry, Strandback, Emilia, Andréll, Juni, Nyman, Tomas, Nouairia, Ghada, and Orwar, Owe
- Subjects
STRUCTURAL bioinformatics, MONOCLONAL antibodies, X-ray crystallography, SMALL molecules, RAS oncogenes, FLUORESCENCE microscopy
- Abstract
KRAS belongs to a family of small GTPases that act as binary switches upstream of several signalling cascades, controlling proliferation and survival of cells. Mutations in KRAS drive oncogenesis, especially in pancreatic, lung, and colorectal cancers (CRC). Although historic attempts at targeting mutant KRAS with small molecule inhibitors have proven challenging, there are recent successes with the G12C and G12D mutations. However, clinically important RAS mutations such as G12V, G13D, Q61L, and A146T remain elusive drug targets, and insight into their structural landscape is of critical importance to develop novel and effective therapeutic concepts. We present a fully open, P-loop exposing conformer of KRAS G13D by X-ray crystallography at 1.4–2.4 Å resolution in Mg2+-free phosphate and malonate buffers. The G13D conformer has the switch-I region displaced in an upright position leaving the catalytic core fully exposed. To prove that this state is druggable, we developed a P-loop-targeting monoclonal antibody (mAb). The mAb displayed high-affinity binding to G13D and was shown using high resolution fluorescence microscopy to be spontaneously taken up by G13D-mutated HCT 116 cells (human CRC derived) by macropinocytosis. The mAb inhibited KRAS signalling in phosphoproteomic and genomic studies. Taken together, the data propose novel druggable space of G13D that is reachable in the cellular context. It is our hope that these findings will stimulate attempts to drug this fully open state G13D conformer using mAbs or other modalities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
33. Predicting Drug-Target Affinity Using Protein Pocket and Graph Convolution Network
- Author
-
Li, Yunhai, Li, Pengpai, Sun, Duanchen, Liu, Zhi-Ping, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Peng, Wei, editor, Cai, Zhipeng, editor, and Skums, Pavel, editor
- Published
- 2024
34. Identification of type VI secretion system effector-immunity pairs using structural bioinformatics
- Author
-
Alexander M Geller, Maor Shalom, David Zlotkin, Noam Blum, and Asaf Levy
- Subjects
Alphafold-multimer, Effector-immunity Pairs, Foldseek, Structural Bioinformatics, Type VI Secretion System (T6SS), Biology (General), QH301-705.5, Medicine (General), R5-920
- Abstract
The type VI secretion system (T6SS) is an important mediator of microbe–microbe and microbe–host interactions. Gram-negative bacteria use the T6SS to inject T6SS effectors (T6Es), which are usually proteins with toxic activity, into neighboring cells. Antibacterial effectors have cognate immunity proteins that neutralize self-intoxication. Here, we applied novel structural bioinformatic tools to perform systematic discovery and functional annotation of T6Es and their cognate immunity proteins from a dataset of 17,920 T6SS-encoding bacterial genomes. Using structural clustering, we identified 517 putative T6E families, outperforming sequence-based clustering. We developed a logistic regression model to reliably quantify protein–protein interaction of new T6E-immunity pairs, yielding candidate immunity proteins for 231 out of the 517 T6E families. We used sensitive structure-based annotation which yielded functional annotations for 51% of the T6E families, again outperforming sequence-based annotation. Next, we validated four novel T6E-immunity pairs using basic experiments in E. coli. In particular, we showed that the Pfam domain DUF3289 is a homolog of Colicin M and that DUF943 acts as its cognate immunity protein. Furthermore, we discovered a novel T6E that is a structural homolog of SleB, a lytic transglycosylase, and identified a specific glutamate that acts as its putative catalytic residue. Overall, this study applies novel structural bioinformatic tools to T6E-immunity pair discovery, and provides an extensive database of annotated T6E-immunity pairs.
- Published
- 2024
35. Identification of type VI secretion system effector-immunity pairs using structural bioinformatics.
- Author
-
Geller, Alexander M, Shalom, Maor, Zlotkin, David, Blum, Noam, and Levy, Asaf
- Subjects
STRUCTURAL bioinformatics, ESCHERICHIA coli, SECRETION, BACTERIAL genomes, PROTEIN-protein interactions
- Abstract
The type VI secretion system (T6SS) is an important mediator of microbe–microbe and microbe–host interactions. Gram-negative bacteria use the T6SS to inject T6SS effectors (T6Es), which are usually proteins with toxic activity, into neighboring cells. Antibacterial effectors have cognate immunity proteins that neutralize self-intoxication. Here, we applied novel structural bioinformatic tools to perform systematic discovery and functional annotation of T6Es and their cognate immunity proteins from a dataset of 17,920 T6SS-encoding bacterial genomes. Using structural clustering, we identified 517 putative T6E families, outperforming sequence-based clustering. We developed a logistic regression model to reliably quantify protein–protein interaction of new T6E-immunity pairs, yielding candidate immunity proteins for 231 out of the 517 T6E families. We used sensitive structure-based annotation which yielded functional annotations for 51% of the T6E families, again outperforming sequence-based annotation. Next, we validated four novel T6E-immunity pairs using basic experiments in E. coli. In particular, we showed that the Pfam domain DUF3289 is a homolog of Colicin M and that DUF943 acts as its cognate immunity protein. Furthermore, we discovered a novel T6E that is a structural homolog of SleB, a lytic transglycosylase, and identified a specific glutamate that acts as its putative catalytic residue. Overall, this study applies novel structural bioinformatic tools to T6E-immunity pair discovery, and provides an extensive database of annotated T6E-immunity pairs. Synopsis: Structural bioinformatic tools were utilized for the discovery of novel specialized Type VI Secretion System (T6SS) effectors and their cognate immunity proteins, highlighting their utility over standard sequence-based tools. The effector predictions were supported by experimental results. 
Structural clustering provided better compression of effectors than sequence-based methods, with 517 structural clusters representing the structure space of specialized effectors in Proteobacteria. The ipTM score from Alphafold-multimer was used as a reliable and quantitative measure for predicting candidate immunity proteins in 231 out of 517 effector clusters. Annotations were provided for 265 out of the 517 specialized effector domain families using fast and sensitive searches with Foldseek, expanding capabilities beyond Pfam-based annotation alone. Four putative effectors were demonstrated to be toxic to Escherichia coli, with co-expression of cognate immunity proteins neutralizing their toxicity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
36. EGG: Accuracy Estimation of Individual Multimeric Protein Models Using Deep Energy-Based Models and Graph Neural Networks.
- Author
-
Siciliano, Andrew Jordan, Zhao, Chenguang, Liu, Tong, and Wang, Zheng
- Subjects
GRAPH neural networks, PROTEIN models, EGGS, QUATERNARY structure, SYSTEMS biology, CYTOLOGY
- Abstract
Reliable and accurate methods of estimating the accuracy of predicted protein models are vital to understanding their respective utility. Discerning how the quaternary structure conforms can significantly improve our collective understanding of cell biology, systems biology, disease formation, and disease treatment. Accurately determining the quality of multimeric protein models is still computationally challenging, as the space of possible conformations is significantly larger when proteins form in complex with one another. Here, we present EGG (energy and graph-based architectures) to assess the accuracy of predicted multimeric protein models. We implemented message-passing and transformer layers to infer the overall fold and interface accuracy scores of predicted multimeric protein models. When evaluated with CASP15 targets, our methods achieved promising results against single model predictors: fourth and third place for determining the highest-quality model when estimating overall fold accuracy and overall interface accuracy, respectively, and first place for determining the top three highest quality models when estimating both overall fold accuracy and overall interface accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
37. ASCC1 structures and bioinformatics reveal a novel helix-clasp-helix RNA-binding motif linked to a two-histidine phosphodiesterase.
- Author
-
Chinnam, Naga babu, Thapar, Roopa, Arvai, Andrew S., Sarker, Altaf H., Soll, Jennifer M., Paul, Tanmoy, Syed, Aleem, Rosenberg, Daniel J., Hammel, Michal, Bacolla, Albino, Katsonis, Panagiotis, Asthana, Abhishek, Tsai, Miaw-Sheue, Ivanov, Ivaylo, Lichtarge, Olivier, Silverman, Robert H., Mosammaparast, Nima, Tsutakawa, Susan E., and Tainer, John A.
- Subjects
STRUCTURAL bioinformatics, SPINAL muscular atrophy, SMALL-angle X-ray scattering, NON-coding RNA, DIHEDRAL angles, CIRCULAR RNA
- Abstract
Activating signal co-integrator complex 1 (ASCC1) acts with ASCC-ALKBH3 complex in alkylation damage responses. ASCC1 uniquely combines two evolutionarily ancient domains: nucleotide-binding K-Homology (KH) (associated with regulating splicing, transcription, and translation) and two-histidine phosphodiesterase (PDE; associated with hydrolysis of cyclic nucleotide phosphate bonds). Germline mutations link loss of ASCC1 function to spinal muscular atrophy with congenital bone fractures 2 (SMABF2). Herein, analysis of The Cancer Genome Atlas (TCGA) suggests ASCC1 RNA overexpression in certain tumors correlates with poor survival, Signatures 29 and 3 mutations, and genetic instability markers. We determined crystal structures of Alvinella pompejana (Ap) ASCC1 and Human (Hs) PDE domain revealing high-resolution details and features conserved over 500 million years of evolution. Extending our understanding of the KH domain Gly-X-X-Gly sequence motif, we define a novel structural Helix-Clasp-Helix (HCH) nucleotide binding motif and show ASCC1 sequence-specific binding to CGCG-containing RNA. The V-shaped PDE nucleotide binding channel has two His-Φ-Ser/Thr-Φ (HXT) motifs (Φ being hydrophobic) positioned to initiate cyclic phosphate bond hydrolysis. A conserved atypical active-site histidine torsion angle implies a novel PDE substrate. Flexible active site loop and arginine-rich domain linker appear regulatory. Small-angle X-ray scattering (SAXS) revealed aligned KH-PDE RNA binding sites with limited flexibility in solution. Quantitative evolutionary bioinformatic analyses of disease and cancer-associated mutations support implied functional roles for RNA binding, phosphodiesterase activity, and regulation. Collective results inform ASCC1's roles in transactivation and alkylation damage responses, its targeting by structure-based inhibitors, and how ASCC1 mutations may impact inherited disease and cancer. [ABSTRACT FROM AUTHOR]
- Published
- 2024
38. RCSB Protein Data Bank: exploring protein 3D similarities via comprehensive structural alignments.
- Author
-
Bittrich, Sebastian, Segura, Joan, Duarte, Jose M, Burley, Stephen K, and Rose, Yana
- Subjects
*BANKING industry, *WEB portals, *PROTEIN structure, *STRUCTURAL bioinformatics, *PROTEINS, *SYNTHETIC biology, *UNIFORM Resource Locators
- Abstract
Motivation: Tools for pairwise alignments between 3D structures of proteins are of fundamental importance for structural biology and bioinformatics, enabling visual exploration of evolutionary and functional relationships. However, the absence of a user-friendly, browser-based tool for creating alignments and visualizing them at both 1D sequence and 3D structural levels makes this process unnecessarily cumbersome. Results: We introduce a novel pairwise structure alignment tool (rcsb.org/alignment) that seamlessly integrates into the RCSB Protein Data Bank (RCSB PDB) research-focused RCSB.org web portal. Our tool and its underlying application programming interface (alignment.rcsb.org) empower users to align several protein chains with a reference structure by providing access to established alignment algorithms (FATCAT, CE, TM-align, or Smith–Waterman 3D). The user-friendly interface simplifies parameter setup and input selection. Within seconds, our tool enables visualization of results in both sequence (1D) and structural (3D) perspectives through the RCSB PDB RCSB.org Sequence Annotations viewer and Mol* 3D viewer, respectively. Users can effortlessly compare structures deposited in the PDB archive alongside more than a million incorporated Computed Structure Models coming from the ModelArchive and AlphaFold DB. Moreover, this tool can be used to align custom structure data by providing a link/URL or uploading atomic coordinate files directly. Importantly, alignment results can be bookmarked and shared with collaborators. By bridging the gap between 1D sequence and 3D structures of proteins, our tool facilitates deeper understanding of complex evolutionary relationships among proteins through comprehensive sequence and structural analyses. Availability and implementation: The alignment tool is part of the RCSB PDB research-focused RCSB.org web portal and available at rcsb.org/alignment. Programmatic access is available via alignment.rcsb.org. Frontend code has been published at github.com/rcsb/rcsb-pecos-app. Visualization is powered by the open-source Mol* viewer (github.com/molstar/molstar and github.com/molstar/rcsb-molstar) plus the Sequence Annotations in 3D Viewer (github.com/rcsb/rcsb-saguaro-3d). [ABSTRACT FROM AUTHOR]
- Published
- 2024
39. Acetylcholinesterase – glucose-regulated protein 78 binding site prediction, a hope to cure neurological disorders such as Alzheimer's disease.
- Author
-
Ali, Ahmed M., Mohamed, Ahmed A., Ibrahim, Ahmed N., and Elfiky, Abdo A.
- Abstract
Cerebral amyloid plaques define Alzheimer's disease (AD), a neurological disorder of the elderly. The enzyme acetylcholinesterase (AChE) was reported to play a vital role in AD. It was shown that AChE induces amyloid fibril formation, forming highly toxic AChE-Amyloid-β (Aβ) complexes. AChE can accelerate amyloid formation, and its inhibition could prevent such alterations to the enzyme. Understanding the proteostasis of AChE and its binding site to the cellular chaperone GRP78 (glucose-regulated protein 78) would help find a treatment for AD. In this study, state-of-the-art computational tools were used to predict the binding location of AChE that can stably associate with the cellular chaperone GRP78. Sequence comparison along with molecular docking predicts two binding locations on AChE (C69–C96 and C257–C272) that could bind to GRP78 substrate binding domain β (SBDβ). The analysis of the docking data suggests that the former location has the best average binding affinity value (−12.16 kcal/mol) and average interaction pattern (13.9 ± 3.5 H-bonds, 5.5 ± 1.4 hydrophobic contacts, and 1.4 ± 1.2 salt bridges). [ABSTRACT FROM AUTHOR]
- Published
- 2024
40. Structure-based prediction of protein-nucleic acid binding using graph neural networks.
- Author
-
Sagendorf, Jared M., Mitra, Raktim, Huang, Jiawei, Chen, Xiaojiang S., and Rohs, Remo
- Abstract
Protein-nucleic acid (PNA) binding plays critical roles in the transcription, translation, regulation, and three-dimensional organization of the genome. Structural models of proteins bound to nucleic acids (NA) provide insights into the chemical, electrostatic, and geometric properties of the protein structure that give rise to NA binding but are scarce relative to models of unbound proteins. We developed a deep learning approach for predicting PNA binding given the unbound structure of a protein that we call PNAbind. Our method utilizes graph neural networks to encode the spatial distribution of physicochemical and geometric properties of protein structures that are predictive of NA binding. Using global physicochemical encodings, our models predict the overall binding function of a protein, and using local encodings, they predict the location of individual NA binding residues. Our models can discriminate between specificity for DNA or RNA binding, and we show that predictions made on computationally derived protein structures can be used to gain mechanistic understanding of chemical and structural features that determine NA recognition. Binding site predictions were validated against benchmark datasets, achieving AUROC scores in the range of 0.92–0.95. We applied our models to the HIV-1 restriction factor APOBEC3G and showed that our model predictions are consistent with and help explain experimental RNA binding data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
41. Native dynamics and allosteric responses in PTP1B probed by high‐resolution HDX‐MS.
- Author
-
Woods, Virgil A., Abzalimov, Rinat R., and Keedy, Daniel A.
- Abstract
Protein tyrosine phosphatase 1B (PTP1B) is a validated therapeutic target for obesity, diabetes, and certain types of cancer. In particular, allosteric inhibitors hold potential for therapeutic use, but an incomplete understanding of conformational dynamics and allostery in this protein has hindered their development. Here, we interrogate solution dynamics and allosteric responses in PTP1B using high‐resolution hydrogen‐deuterium exchange mass spectrometry (HDX‐MS), an emerging and powerful biophysical technique. Using HDX‐MS, we obtain a detailed map of backbone amide exchange that serves as a proxy for the solution dynamics of apo PTP1B, revealing several flexible loops interspersed among more constrained and rigid regions within the protein structure, as well as local regions that exchange faster than expected from their secondary structure and solvent accessibility. We demonstrate that our HDX rate data obtained in solution adds value to estimates of conformational heterogeneity derived from a pseudo‐ensemble constructed from ~200 crystal structures of PTP1B. Furthermore, we report HDX‐MS maps for PTP1B with active‐site versus allosteric small‐molecule inhibitors. These maps suggest distinct and widespread effects on protein dynamics relative to the apo form, including changes in locations distal (>35 Å) from the respective ligand binding sites. These results illuminate that allosteric inhibitors of PTP1B can induce unexpected changes in dynamics that extend beyond the previously understood allosteric network. Together, our data suggest a model of BB3 allostery in PTP1B that combines conformational restriction of active‐site residues with compensatory liberation of distal residues that aid in entropic balancing. 
Overall, our work showcases the potential of HDX‐MS for elucidating aspects of protein conformational dynamics and allosteric effects of small‐molecule ligands and highlights the potential of integrating HDX‐MS alongside other complementary methods, such as room‐temperature X‐ray crystallography, NMR spectroscopy, and molecular dynamics simulations, to guide the development of new therapeutics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
42. Deep Learning Methods for Binding Site Prediction in Protein Structures.
- Author
-
Geraseva, E. P.
- Abstract
This work is an overview of deep machine learning methods aimed at predicting binding sites in protein structures. Several classes of methods are selected: prediction of binding sites for small molecules, proteins, and nucleic acids. For each class, various approaches to prediction are considered (prediction of binding atoms, residues, surfaces, pockets). Specifics of feature selection and neural network architectures inherent to each class and approach are highlighted, and an attempt is made to explain these specifics and foresee the further direction of their development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
43. Bioinformatics approach for structure modeling, vaccine design, and molecular docking of Brucella candidate proteins BvrR, OMP25, and OMP31.
- Author
-
Elrashedy, Alyaa, Nayel, Mohamed, Salama, Akram, Salama, Mohammed M., and Hasan, Mohamed E.
- Subjects
*MOLECULAR docking, *STRUCTURAL bioinformatics, *CYTOTOXIC T cells, *T cells, *BRUCELLA, *ZOONOSES
- Abstract
Brucellosis is a zoonotic disease with significant economic and healthcare costs. Despite eradication efforts, the disease persists. Vaccines prevent disease in animals, while antibiotics cure humans with limitations. This study aims to design vaccines and drugs for brucellosis in animals and humans, using protein modeling, epitope prediction, and molecular docking of the target proteins (BvrR, OMP25, and OMP31). Tertiary structure models of the three target proteins were constructed and assessed using RMSD, TM-score, C-score, Z-score, and ERRAT. The best models, selected from AlphaFold and I-TASSER because of their superior performance according to CASP 12 – CASP 15, were chosen for further analysis. Motif analysis of the best models using MotifFinder revealed two, five, and five protein binding motifs, whereas Motif Scan identified seven, six, and eight post-translational modification (PTM) sites in the BvrR, OMP25, and OMP31 proteins, respectively. Dominant B cell epitopes were predicted at (44–63, 85–93, 126–137, 193–205, and 208–237), (26–46, 52–71, 98–114, 142–155, and 183–200), and (29–45, 58–82, 119–142, 177–198, and 222–251) for the three target proteins. Additionally, cytotoxic T lymphocyte epitopes were detected at (173–181, 189–197, and 202–210), (61–69, 91–99, 159–167, and 181–189), and (3–11, 24–32, 167–175, and 216–224), while T helper lymphocyte epitopes were displayed at (39–53, 57–65, 150–158, 163–171), (79–87, 95–108, 115–123, 128–142, and 189–197), and (39–47, 109–123, 216–224, and 245–253), for the respective target proteins. Furthermore, structure-based virtual screening of the ZINC and DrugBank databases using the MOE docking program was followed by ADMET analysis. The best five compounds of the ZINC database had docking scores ranging from −16.8744 to −15.1922, −16.0424 to −14.1645, and −14.7566 to −13.3222 for BvrR, OMP25, and OMP31, respectively. 
These compounds had good ADMET parameters and no cytotoxicity, while DrugBank compounds didn't meet Lipinski's rule criteria. Therefore, the five selected compounds from the ZINC20 databases may fulfill the pharmacokinetics and could be considered lead molecules for potentially inhibiting Brucella's proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2024
44. Deep learning in structural bioinformatics: current applications and future perspectives.
- Author
-
Kumar, Niranjan and Srivastava, Rakesh
- Subjects
*STRUCTURAL bioinformatics, *DEEP learning, *SCIENTIFIC Revolution, *DRUG discovery
- Abstract
In this review article, we explore the transformative impact of deep learning (DL) on structural bioinformatics, emphasizing its pivotal role in a scientific revolution driven by extensive data, accessible toolkits and robust computing resources. As big data continue to advance, DL is poised to become an integral component in healthcare and biology, revolutionizing analytical processes. Our comprehensive review provides detailed insights into DL, featuring specific demonstrations of its notable applications in bioinformatics. We address challenges tailored for DL, spotlight recent successes in structural bioinformatics and present a clear exposition of DL—from basic shallow neural networks to advanced models such as convolution, recurrent, artificial and transformer neural networks. This paper discusses the emerging use of DL for understanding biomolecular structures, anticipating ongoing developments and applications in the realm of structural bioinformatics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
45. Molecular Dynamics Simulation of Kir6.2 Variants Reveals Potential Association with Diabetes Mellitus.
- Author
-
Elangeeb, Mohamed E., Elfaki, Imadeldin, Eleragi, Ali M. S., Ahmed, Elsadig Mohamed, Mir, Rashid, Alzahrani, Salem M., Bedaiwi, Ruqaiah I., Alharbi, Zeyad M., Mir, Mohammad Muzaffar, Ajmal, Mohammad Rehan, Tayeb, Faris Jamal, and Barnawi, Jameel
- Subjects
*POTASSIUM channels, *MOLECULAR dynamics, *MATURITY onset diabetes of the young, *DIABETES, *STRUCTURAL bioinformatics, *GENETIC testing
- Abstract
Diabetes mellitus (DM) represents a problem for the healthcare system worldwide. DM has very serious complications such as blindness, kidney failure, and cardiovascular disease. In addition to its severe socioeconomic impact, it affects patients, their families, and their communities. The global costs of DM and its complications are huge and expected to rise by the year 2030. DM is caused by genetic and environmental risk factors. Genetic testing will aid in early diagnosis and identification of susceptible individuals or populations. ATP-sensitive potassium (KATP) channels are present in different tissues such as the pancreas, myocardium, myocytes, and nervous tissues. The channels respond to different concentrations of blood sugar, stimulation by hormones, or ischemic conditions. In pancreatic cells, they regulate the secretion of insulin and glucagon. Mutations in the KCNJ11 gene that encodes the Kir6.2 protein (a major constituent of KATP channels) were reported to be associated with Type 2 DM, neonatal diabetes mellitus (NDM), and maturity-onset diabetes of the young (MODY). Kir6.2 harbors binding sites for ATP and phosphatidylinositol 4,5-bisphosphate (PIP2). ATP inhibits the KATP channel, whereas PIP2 activates it. A Kir6.2 mutation at tyrosine 330 (Y330) was demonstrated to reduce ATP inhibition and predispose to NDM. In this study, we examined the effect of mutations on the Kir6.2 structure using bioinformatics tools and molecular dynamics simulations (SIFT, PolyPhen, SNAP2, PANTHER, PhD&SNP, SNP&Go, I-Mutant, MuPro, MutPred, ConSurf, HOPE, and GROMACS). Our results indicated that the M199R, R201H, R206H, and Y330H mutations influence Kir6.2 structure and function and therefore may cause DM. We conclude that MD simulations are useful techniques to predict the effects of mutations on protein structure. In addition, the M199R, R201H, R206H, and Y330H variants in the Kir6.2 protein may be associated with DM. 
These results require further verification in protein–protein interactions, Kir6.2 function, and case-control studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
46. Structural bioinformatics studies of glutamate transporters and their AlphaFold2 predicted water-soluble QTY variants and uncovering the natural mutations of L->Q, I->T, F->Y and Q->L, T->I and Y->F.
- Author
-
Karagöl, Alper, Karagöl, Taner, Smorodina, Eva, and Zhang, Shuguang
- Subjects
*GLUTAMATE transporters, *STRUCTURAL bioinformatics, *MEMBRANE proteins, *ISOELECTRIC focusing, *PROTEIN engineering, *AMINO acids, *LEUCINE, *PHENYLALANINE
- Abstract
Glutamate transporters play key roles in nervous physiology by modulating excitatory neurotransmitter levels; when malfunctioning, they are involved in a wide range of neurological and physiological disorders. However, integral transmembrane proteins including the glutamate transporters remain notoriously difficult to study, due to their localization within the cell membrane. Here we present the structural bioinformatics studies of glutamate transporters and their water-soluble variants generated through QTY-code, a protein design strategy based on systematic amino acid substitutions. These include 2 structures determined by X-ray crystallography and cryo-EM and 6 predicted by AlphaFold2, together with their predicted water-soluble QTY variants. In the native structures of glutamate transporters, transmembrane helices contain hydrophobic amino acids such as leucine (L), isoleucine (I), and phenylalanine (F). To design water-soluble variants, these hydrophobic amino acids are systematically replaced by hydrophilic amino acids, namely glutamine (Q), threonine (T) and tyrosine (Y). The QTY variants exhibited water-solubility, with four having identical isoelectric focusing points (pI) and the other four having very similar pI. We present the superposed structures of the native glutamate transporters and their water-soluble QTY variants. The superposed structures displayed remarkable similarity, with RMSD values of 0.528 Å to 2.456 Å, despite significant protein transmembrane sequence differences (41.1%—>53.8%). Additionally, we examined the differences of hydrophobicity patches between the native glutamate transporters and their QTY variants. Upon closer inspection, we discovered multiple natural variations of L->Q, I->T, F->Y and Q->L, T->I, Y->F in these transporters. Some of these natural variations were benign and the remaining were reported in specific neurological disorders. 
We further investigated the characteristics of hydrophobic to hydrophilic substitutions in glutamate transporters, utilizing variant analysis and evolutionary profiling. Our structural bioinformatics studies not only provided insight into the differences between the hydrophobic helices and hydrophilic helices in the glutamate transporters, but they are also expected to stimulate further study of other water-soluble transmembrane proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2024
47. Prop3D: A flexible, Python-based platform for machine learning with protein structural properties and biophysical data
- Author
-
Eli J. Draizen, John Readey, Cameron Mura, and Philip E. Bourne
- Subjects
Deep learning, Machine learning, Massively parallel workflows, Protein structure, Structural bioinformatics, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Background: Machine learning (ML) has a rich history in structural bioinformatics, and modern approaches, such as deep learning, are revolutionizing our knowledge of the subtle relationships between biomolecular sequence, structure, function, dynamics and evolution. As with any advance that rests upon statistical learning approaches, the recent progress in biomolecular sciences is enabled by the availability of vast volumes of sufficiently-variable data. To be useful, such data must be well-structured, machine-readable, intelligible and manipulable. These and related requirements pose challenges that become especially acute at the computational scales typical in ML. Furthermore, in structural bioinformatics such data generally relate to protein three-dimensional (3D) structures, which are inherently more complex than sequence-based data. A significant and recurring challenge concerns the creation of large, high-quality, openly-accessible datasets that can be used for specific training and benchmarking tasks in ML pipelines for predictive modeling projects, along with reproducible splits for training and testing. Results: Here, we report ‘Prop3D’, a platform that allows for the creation, sharing and extensible reuse of libraries of protein domains, featurized with biophysical and evolutionary properties that can range from detailed, atomically-resolved physicochemical quantities (e.g., electrostatics) to coarser, residue-level features (e.g., phylogenetic conservation). As a community resource, we also supply a ‘Prop3D-20sf’ protein dataset, obtained by applying our approach to CATH. We have developed and deployed the Prop3D framework, both in the cloud and on local HPC resources, to systematically and reproducibly create comprehensive datasets via the Highly Scalable Data Service (HSDS). Our datasets are freely accessible via a public HSDS instance, or they can be used with accompanying Python wrappers for popular ML frameworks. Conclusion: Prop3D and its associated Prop3D-20sf dataset can be of broad utility in at least three ways. Firstly, the Prop3D workflow code can be customized and deployed on various cloud-based compute platforms, with scalability achieved largely by saving the results to distributed HDF5 files via HSDS. Secondly, the linked Prop3D-20sf dataset provides a hand-crafted, already-featurized dataset of protein domains for 20 highly-populated CATH families; importantly, provision of this pre-computed resource can aid the more efficient development (and reproducible deployment) of ML pipelines. Thirdly, Prop3D-20sf’s construction explicitly takes into account (in creating datasets and data-splits) the enigma of ‘data leakage’, stemming from the evolutionary relationships between proteins.
- Published
- 2024
48. Deep learning for protein structure prediction and design—progress and applications
- Author
-
Jürgen Jänes and Pedro Beltrao
- Subjects
AlphaFold2, Structural Bioinformatics, Protein Design, Protein Conformations, Structural Systems Biology, Biology (General), QH301-705.5, Medicine (General), R5-920
- Abstract
Proteins are the key molecular machines that orchestrate all biological processes of the cell. Most proteins fold into three-dimensional shapes that are critical for their function. Studying the 3D shape of proteins can inform us of the mechanisms that underlie biological processes in living cells and can have practical applications in the study of disease mutations or the discovery of novel drug treatments. Here, we review the progress made in sequence-based prediction of protein structures with a focus on applications that go beyond the prediction of single monomer structures. This includes the application of deep learning methods for the prediction of structures of protein complexes, different conformations, the evolution of protein structures and the application of these methods to protein design. These developments create new opportunities for research that will have impact across many areas of biomedical research.
- Published
- 2024
49. An in-silico analysis of OGT gene association with diabetes mellitus.
- Author
-
Ayodele, Abigail O., Udosen, Brenda, Oluwagbemi, Olugbenga O., Oladipo, Elijah K., Omotuyi, Idowu, Isewon, Itunuoluwa, Nash, Oyekanmi, Soremekun, Opeyemi, and Fatumo, Segun
- Subjects
*DIABETES, *PROTEIN overexpression, *POST-translational modification, *STRUCTURAL bioinformatics, *DRUG analysis
- Abstract
O-GlcNAcylation is a nutrient-sensing post-translational modification process. This cycling process involves two primary proteins: the O-linked N-acetylglucosamine transferase (OGT) catalysing the addition, and the glycoside hydrolase OGA (O-GlcNAcase) catalysing the removal of the O-GlcNAc moiety on nucleocytoplasmic proteins. This process is necessary for various critical cellular functions. The O-linked N-acetylglucosamine transferase (OGT) gene produces the OGT protein. Several studies have shown the overexpression of this protein to have biological implications in metabolic diseases like cancer and diabetes mellitus (DM). This study retrieved 159 SNPs with clinical significance from the SNPs database. We probed the functional effects, stability profile, and evolutionary conservation of these to determine their fit for this research. We then identified 7 SNPs (G103R, N196K, Y228H, R250C, G341V, L367F, and C845S) with predicted deleterious effects across the four tools used (PhD-SNPs, SNPs&Go, PROVEAN, and PolyPhen2). Proceeding with this, we used ROBETTA, a homology modelling tool, to model the proteins with these point mutations and carried out a structural bioinformatics method, molecular docking, using the Glide model of the Schrödinger Maestro suite. We used a previously reported inhibitor of OGT, OSMI-1, as the ligand for these mutated protein models. As a result, very good binding affinities and interactions were observed between this ligand and the active site residues within 4 Å of OGT. We conclude that these mutation points may be used for further downstream analysis as drug targets for treating diabetes mellitus. [ABSTRACT FROM AUTHOR]
- Published
- 2024
50. Uncovering structural themes across cilia microtubule inner proteins with implications for human cilia function.
- Author
-
Andersen, Jens S., Vijayakumaran, Aaran, Godbehere, Christopher, Lorentzen, Esben, Mennella, Vito, and Schou, Kenneth Bødtker
- Subjects
TUBULINS, CILIA & ciliary motion, MICROTUBULES, CELL division, STRUCTURAL bioinformatics, CENTROSOMES
- Abstract
Centrosomes and cilia are microtubule-based superstructures vital for cell division, signaling, and motility. The lumen of their microtubule core structures, once thought hollow, was recently found to hold a rich meshwork of microtubule inner proteins (MIPs). To address the outstanding question of how distinct MIPs evolved to recognize microtubule inner surfaces, we applied computational sequence analyses, structure predictions, and experimental validation to uncover evolutionarily conserved microtubule- and MIP-binding modules named NWE, SNYG, and ELLEn, and PYG and GFG-repeat by their signature motifs. These modules intermix with MT-binding DM10-modules and Mn-repeats in 24 Chlamydomonas and 33 human proteins. The modules' molecular characteristics provided keys to identify elusive cross-species homologs, hitherto unknown human MIP candidates, and functional properties for seven protein subfamilies, including the microtubule seam-binding NWE and ELLEn families. Our work defines structural innovations that underpin centriole and axoneme assembly and demonstrates that MIPs co-evolved with centrosomes and cilia. The inside surface of microtubules contains so-called microtubule inner proteins, but little is known about their identity. Here the authors use bioinformatics to identify structural motifs within this class of proteins and potential new members. [ABSTRACT FROM AUTHOR]
- Published
- 2024