Author: "Marks DS" / Topic: proteins - Searchworks@Jio Institute Digital Library Search Results

Your search for "Marks DS" returned 11 results

1. Protein design using structure-based residue preferences.

Author: Ding D, Shaw AY, Sinai S, Rollins N, Prywes N, Savage DF, Laub MT, and Marks DS
Subjects: Amino Acids chemistry, Mutation, Proteins metabolism, Neural Networks, Computer
Abstract: Recent developments in protein design rely on large neural networks with up to hundreds of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues, without accounting for mutation interactions, explain much and sometimes virtually all of the combinatorial mutation effects across 8 datasets (R² ~ 78-98%). Hence, few observations (~100 times the number of mutated residues) enable accurate prediction of held-out variant effects (Pearson r > 0.80). We hypothesized that the local structural contexts around a residue could be sufficient to predict mutation preferences, and developed an unsupervised approach termed CoVES (Combinatorial Variant Effects from Structure). Our results suggest that CoVES not only outperforms model-free methods but also performs similarly to complex models for creating functional and diverse protein variants. CoVES offers an effective alternative to complicated models for identifying functional protein mutations. (© 2024. The Author(s).)
Published: 2024

2. Democratizing the mapping of gene mutations to protein biophysics.

Author: Marks DS and Michnick SW
Subjects: Biophysical Phenomena, Biophysics, Mutation, Biology, Proteins
Published: 2022

3. Disease variant prediction with deep generative models of evolutionary data.

Author: Frazer J, Notin P, Dias M, Gomez A, Min JK, Brock K, Gal Y, and Marks DS
Subjects: Bayes Theorem, Biological Assay, Genetic Predisposition to Disease genetics, Humans, Models, Molecular, Phenotype, Proteins metabolism, Disease genetics, Evolution, Molecular, Genetic Fitness genetics, Genetic Variation, Proteins genetics, Selection, Genetic, Unsupervised Machine Learning
Abstract: Quantifying the pathogenicity of protein variants in human disease-related genes would have a marked effect on clinical decisions, yet the overwhelming majority (over 98%) of these variants still have unknown consequences [1-3]. In principle, computational methods could support the large-scale interpretation of genetic variants. However, state-of-the-art methods [4-10] have relied on training machine learning models on known disease labels. As these labels are sparse, biased and of variable quality, the resulting models have been considered insufficiently reliable [11]. Here we propose an approach that leverages deep generative models to predict variant pathogenicity without relying on labels. By modelling the distribution of sequence variation across organisms, we implicitly capture constraints on the protein sequences that maintain fitness. Our model EVE (evolutionary model of variant effect) not only outperforms computational approaches that rely on labelled data but also performs on par with, if not better than, predictions from high-throughput experiments, which are increasingly used as evidence for variant classification [12-16]. We predict the pathogenicity of more than 36 million variants across 3,219 disease genes and provide evidence for the classification of more than 256,000 variants of unknown significance. Our work suggests that models of evolutionary information can provide valuable independent evidence for variant interpretation that will be widely useful in research and clinical settings. (© 2021. The Author(s), under exclusive licence to Springer Nature Limited.)
Published: 2021

4. Protein Structure from Experimental Evolution.

Author: Stiffler MA, Poelwijk FJ, Brock KP, Stein RR, Riesselman A, Teyra J, Sidhu SS, Marks DS, Gauthier NP, and Sander C
Subjects: Humans, Protein Conformation, Evolution, Molecular, Proteins chemistry
Abstract: Natural evolution encodes rich information about the structure and function of biomolecules in the genetic record. Previously, statistical analysis of co-variation patterns in natural protein families has enabled the accurate computation of 3D structures. Here, we explored generating similar information by experimental evolution, starting from a single gene and performing multiple cycles of in vitro mutagenesis and functional selection in Escherichia coli. We evolved two antibiotic resistance proteins, β-lactamase PSE1 and acetyltransferase AAC6, and obtained hundreds of thousands of diverse functional sequences. Using evolutionary coupling analysis, we inferred residue interaction constraints that were in agreement with contacts in known 3D structures, confirming genetic encoding of structural constraints in the selected sequences. Computational protein folding with interaction constraints then yielded 3D structures with the same fold as natural relatives. This work lays the foundation for a new experimental method (3Dseq) for protein structure determination, combining evolution experiments with inference of residue interactions from sequence information. A record of this paper's Transparent Peer Review process is included in the Supplemental Information. (Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.)
Published: 2020

5. A Hybrid Approach for Protein Structure Determination Combining Sparse NMR with Evolutionary Coupling Sequence Data.

Author: Huang YJ, Brock KP, Sander C, Marks DS, and Montelione GT
Subjects: Sequence Alignment, Evolution, Molecular, Nuclear Magnetic Resonance, Biomolecular, Protein Conformation, Proteins chemistry
Abstract: While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid "EC-NMR" method can be used to accurately model larger (15-60 kDa) proteins, and more rapidly determine structures of smaller (5-15 kDa) proteins using only backbone NMR data. The resulting structures have accuracies relative to reference structures comparable to those obtained with full backbone and sidechain NMR resonance assignments. The requirement that evolutionary couplings (ECs) are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, potentially also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.
Published: 2018

6. Protein structure determination by combining sparse NMR data with evolutionary couplings.

Author: Tang Y, Huang YJ, Hopf TA, Sander C, Marks DS, and Montelione GT
Subjects: Algorithms, Crystallography, X-Ray, Evolution, Molecular, Humans, Hydrodynamics, Imaging, Three-Dimensional, Models, Statistical, Molecular Conformation, Protein Conformation, Proto-Oncogene Proteins chemistry, Proto-Oncogene Proteins p21(ras), ras Proteins chemistry, Computational Biology methods, Magnetic Resonance Spectroscopy methods, Proteins chemistry
Abstract: Accurate determination of protein structure by NMR spectroscopy is challenging for larger proteins, for which experimental data are often incomplete and ambiguous. Evolutionary sequence information together with advances in maximum entropy statistical methods provide a rich complementary source of structural constraints. We have developed a hybrid approach (evolutionary coupling-NMR spectroscopy; EC-NMR) combining sparse NMR data with evolutionary residue-residue couplings and demonstrate accurate structure determination for several proteins 6-41 kDa in size.
Published: 2015

7. Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models.

Author: Stein RR, Marks DS, and Sander C
Subjects: Amino Acid Sequence, Binding Sites, Computer Simulation, Entropy, Molecular Sequence Data, Protein Binding, Algorithms, Models, Chemical, Models, Statistical, Protein Interaction Mapping methods, Proteins chemistry, Sequence Analysis, Protein methods
Abstract: Maximum entropy-based inference methods have been successfully used to infer direct interactions from biological datasets such as gene expression data or sequence ensembles. Here, we review undirected pairwise maximum-entropy probability models in two categories of data types, those with continuous and categorical random variables. As a concrete example, we present recently developed inference methods from the field of protein contact prediction and show that a basic set of assumptions leads to similar solution strategies for inferring the model parameters in both variable types. These parameters reflect interactive couplings between observables, which can be used to predict global properties of the biological system. Such methods are applicable to the important problems of protein 3-D structure prediction and association of gene-gene networks, and they enable potential applications to the analysis of gene alteration patterns and to protein design.
Published: 2015

8. FreeContact: fast and free software for protein contact prediction from residue co-evolution.

Author: Kaján L, Hopf TA, Kalaš M, Marks DS, and Rost B
Subjects: Algorithms, Protein Conformation, Proteins genetics, Computational Biology methods, Proteins chemistry, Sequence Analysis, Protein methods, Software
Abstract: Background: Twenty years of improved technology and growing sequence databases now render residue-residue contact constraints, derived from correlated mutations in large protein families, accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. In addition, EVfold-mfDCA depends on proprietary software. Results: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with negligible performance decrease. EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library "libfreecontact", complete with command line tool "freecontact", as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability. Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).
Published: 2014

9. Protein structure prediction from sequence variation.

Author: Marks DS, Hopf TA, and Sander C
Subjects: Amino Acid Sequence, Computer Simulation, Molecular Sequence Data, Protein Conformation, Genetic Variation genetics, Models, Chemical, Models, Genetic, Models, Molecular, Proteins chemistry, Proteins genetics, Sequence Analysis, Protein methods
Abstract: Genomic sequences contain rich evolutionary information about functional constraints on macromolecules such as proteins. This information can be efficiently mined to detect evolutionary couplings between residues in proteins and address the long-standing challenge to compute protein three-dimensional structures from amino acid sequences. Substantial progress has recently been made on this problem owing to the explosive growth in available sequences and the application of global statistical methods. In addition to three-dimensional structure, the improved understanding of covariation may help identify functional residues involved in ligand binding, protein-complex formation and conformational changes. We expect computation of covariation patterns to complement experimental structural biology in elucidating the full spectrum of protein structures, their functional interactions and evolutionary dynamics.
Published: 2012

10. Direct-coupling analysis of residue coevolution captures native contacts across many protein families.

Author: Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, and Weigt M
Subjects: Amino Acids genetics, Amino Acids metabolism, Binding Sites genetics, Models, Molecular, Protein Binding, Protein Conformation, Protein Interaction Mapping methods, Protein Multimerization, Proteins genetics, Proteins metabolism, Reproducibility of Results, Algorithms, Amino Acids chemistry, Computational Biology methods, Proteins chemistry
Abstract: The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced direct-coupling analysis (DCA). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intradomain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and interdomain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
Published: 2011

11. Protein 3D structure computed from evolutionary sequence variation.

Author: Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, and Sander C
Subjects: Animals, Catalytic Domain, Computational Biology methods, Drug Design, Entropy, Evolution, Molecular, Genetic Variation, Genome, Humans, Imaging, Three-Dimensional, Models, Statistical, Protein Conformation, Protein Structure, Tertiary, Reproducibility of Results, Rhodopsin chemistry, Trypsin chemistry, Proteins chemistry
Abstract: The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes.
Published: 2011
