Descriptor: "Databases, Chemical statistics & numerical data" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Databases, Chemical statistics & numerical data"' showing total 32 results

1. Major chemical database investigates hundreds of suspicious crystal structures.

Author: Else H
Subjects: Crystallography, X-Ray standards, Databases, Chemical standards, Databases, Chemical statistics & numerical data, Fraud statistics & numerical data, Scientific Misconduct statistics & numerical data
Published: 2022
Full Text: View/download PDF

2. Epigenetic Target Fishing with Accurate Machine Learning Models.

Author: Sánchez-Cruz N and Medina-Franco JL
Subjects: Databases, Chemical statistics & numerical data, Histone Deacetylases metabolism, Molecular Structure, Organic Chemicals metabolism, Proof of Concept Study, Structure-Activity Relationship, Transcription Factors metabolism, Drug Discovery methods, Epigenomics methods, Machine Learning, Organic Chemicals chemistry
Abstract: Epigenetic targets are of significant importance in drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. This data represents many structure-activity relationships that have not been exploited thus far to develop predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26 318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. We built predictive models with high accuracy for small molecules' epigenetic target profiling through a systematic comparison of the machine learning models trained on different molecular fingerprints. The models were thoroughly validated, showing mean precisions of up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible web application.
Published: 2021
Full Text: View/download PDF

3. Development of an a priori computational approach for brain uptake of compounds in an insect model system.

Author: Geldenhuys WJ and Bloomquist JR
Subjects: Animals, Central Nervous System Agents chemistry, Cheminformatics, Databases, Chemical statistics & numerical data, Grasshoppers metabolism, Linear Models, Models, Biological, Neural Networks, Computer, Support Vector Machine, Brain metabolism, Central Nervous System Agents metabolism
Abstract: Delivery of compounds to the brain is critical for the development of effective treatment therapies of multiple central nervous system diseases. Recently a novel insect-based brain uptake model was published utilizing a locust brain ex vivo system. The goal of our study was to develop a priori, in silico cheminformatic models to describe brain uptake in this insect model, as well as evaluate the predictive ability. The machine learning program Orange® was used to evaluate several machine learning (ML) models on a published data set of 25 known drugs, with in vitro data generated by a single laboratory group to reduce inherent inter-laboratory variability. The ML models included in this study were linear regression (LR), support vector machines (SVM), k-nearest neighbor (kNN) and neural nets (NN). The quantitative structure-property relationship models correlated experimental logCtot (concentration of compound in brain) with predicted brain uptake at r² > 0.5, with the descriptors log(P*MW^-0.5) and hydrogen bond donor used in LR, SVM and kNN, while log(P*MW^-0.5) and total polar surface area (TPSA) descriptors were used in the NN models. Our results indicate that the locust insect model is amenable to data mining chemoinformatics and in silico model development in CNS drug discovery pipelines. (Copyright © 2021 Elsevier Ltd. All rights reserved.)
Published: 2021
Full Text: View/download PDF

4. Identification of SARS-CoV-2 viral entry inhibitors using machine learning and cell-based pseudotyped particle assay.

Author: Sun H, Wang Y, Chen CZ, Xu M, Guo H, Itkin M, Zheng W, and Shen M
Subjects: Area Under Curve, Databases, Chemical statistics & numerical data, Drug Repositioning, HEK293 Cells, Humans, Microbial Sensitivity Tests, ROC Curve, Small Molecule Libraries pharmacology, Antiviral Agents pharmacology, SARS-CoV-2 drug effects, Support Vector Machine statistics & numerical data, Virus Internalization drug effects
Abstract: In response to the pandemic caused by SARS-CoV-2, we constructed a hybrid support vector machine (SVM) classification model using a set of publicly posted SARS-CoV-2 pseudotyped particle (PP) entry assay repurposing screen data to identify novel potent compounds as a starting point for drug development to treat COVID-19 patients. Two different molecular descriptor systems, atom typing descriptors and 3D fingerprints (FPs), were employed to construct the SVM classification models. Both models achieved reasonable performance, with areas under the receiver operating characteristic curve (AUC-ROC) of 0.84 and 0.82, respectively. The consensus prediction, from which compounds with inconsistent classifications were excluded, outperformed the two individual models with a significantly improved AUC-ROC of 0.91. The consensus model was then used to screen the 173,898 compounds in the NCATS annotated and diverse chemical libraries. Of the 255 compounds selected for experimental confirmation, 116 compounds exhibited inhibitory activities in the SARS-CoV-2 PP entry assay with IC50 values ranging from 0.17 µM to 62.2 µM, representing an enrichment factor of 3.2. These 116 active compounds with diverse and novel structures could potentially serve as starting points for chemistry optimization for COVID-19 drug discovery. (Published by Elsevier Ltd.)
Published: 2021
Full Text: View/download PDF
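
Note on record 4: the consensus step described in the abstract (keep only compounds on which the two classifiers agree) can be illustrated with a minimal sketch. The function name and the example labels below are hypothetical and are not taken from the paper; the authors' actual implementation is not described in this record.

```python
def consensus_filter(preds_a, preds_b):
    """Keep (index, label) pairs for compounds on which both models agree.

    preds_a, preds_b: equal-length sequences of binary labels (1 = active,
    0 = inactive) from two independently trained classifiers. Compounds
    with inconsistent classifications are excluded from the result.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return [(i, a) for i, (a, b) in enumerate(zip(preds_a, preds_b)) if a == b]


# Hypothetical predictions from a descriptor-based and a fingerprint-based model.
atom_type_preds = [1, 0, 1, 1, 0]
fingerprint_preds = [1, 0, 0, 1, 1]

kept = consensus_filter(atom_type_preds, fingerprint_preds)
print(kept)  # [(0, 1), (1, 0), (3, 1)]
```

Excluding disagreements trades coverage for confidence: fewer compounds receive a prediction, but each retained prediction is supported by both descriptor systems.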

5. Global Assessment of Substituents on the Basis of Analogue Series.

Author: Takeuchi K, Kunimoto R, and Bajorath J
Subjects: Algorithms, Chemistry, Pharmaceutical methods, Databases, Chemical statistics & numerical data, Molecular Structure, Organic Chemicals chemistry, Pharmaceutical Preparations chemistry
Abstract: While bioisosteric replacements have been extensively investigated, comprehensive analyses of R-/functional groups have thus far been rare in medicinal chemistry. We introduce a new analysis concept for the exploration of chemical substituent space that is based upon bioactive analogue series as a source. From ∼24,000 analogue series, more than 19,000 substituents were isolated that were differently distributed. A subset of ∼400 substituent fragments occurred most frequently in different structural contexts. These substituents contained well-known R-groups as well as novel structures. Substitution site-specific replacement and network analysis revealed that chemically similar substituents preferentially occurred at given sites and identified intuitive substitution pathways that can be explored for compound design. Taken together, the results of our analysis provide new insights into substituent space and identify preferred substituents on the basis of analogue series. As a part of our study, all the data reported are made freely available.
Published: 2020
Full Text: View/download PDF

6. A network-based pharmacology study of active compounds and targets of Fritillaria thunbergii against influenza.

Author: Kim M and Kim YB
Subjects: Anthocyanins chemistry, Anthocyanins pharmacokinetics, Antiviral Agents pharmacokinetics, Databases, Chemical statistics & numerical data, Databases, Genetic statistics & numerical data, Humans, Pharmacology methods, Protein Interaction Maps, Sitosterols chemistry, Sitosterols pharmacokinetics, Systems Biology methods, Antiviral Agents chemistry, Fritillaria chemistry, Orthomyxoviridae drug effects
Abstract: Seasonal and pandemic influenza infections are serious threats to public health and the global economy. Since antigenic drift reduces the effectiveness of conventional therapies against the virus, herbal medicine has been proposed as an alternative. Fritillaria thunbergii (FT) has been traditionally used to treat airway inflammatory diseases such as coughs, bronchitis, pneumonia, and fever-based illnesses. Herein, we used a network pharmacology-based strategy to predict potential compounds from FT, target genes, and cellular pathways to better combat influenza and influenza-associated diseases. We identified five compounds and 47 target genes using a compound-target network (C-T). Two compounds (beta-sitosterol and pelargonidin) and nine target genes (BCL2, CASP3, HSP90AA1, ICAM1, JUN, NOS2, PPARG, PTGS1, PTGS2) were identified using a compound-influenza disease target network (C-D). A protein-protein interaction (PPI) network was constructed, in which eight proteins from the nine target genes formed a network. The compound-disease-pathway network (C-D-P) revealed three classes of pathways linked to influenza: cancer, viral diseases, and inflammation. Taken together, our systems biology data from C-T, C-D, PPI and C-D-P networks predicted potent compounds from FT and new therapeutic targets and pathways involved in influenza. (Copyright © 2020 Elsevier Ltd. All rights reserved.)
Published: 2020
Full Text: View/download PDF

7. Hit identification against peptidyl-prolyl isomerase of Theileria annulata by combined virtual high-throughput screening and molecular dynamics simulation approach.

Author: Spahi S, Mutlu O, Sariyer E, Kocer S, Ugurel E, and Turgut-Balik D
Subjects: Catalytic Domain, Databases, Chemical statistics & numerical data, Enzyme Inhibitors metabolism, High-Throughput Screening Assays, Ligands, Molecular Docking Simulation, Molecular Dynamics Simulation, Mutation, Naphthoquinones metabolism, Peptidylprolyl Isomerase chemistry, Peptidylprolyl Isomerase genetics, Peptidylprolyl Isomerase metabolism, Protein Binding, Proto-Oncogene Mas, Protozoan Proteins chemistry, Protozoan Proteins genetics, Protozoan Proteins metabolism, Enzyme Inhibitors chemistry, Naphthoquinones chemistry, Peptidylprolyl Isomerase antagonists & inhibitors, Protozoan Proteins antagonists & inhibitors, Theileria annulata enzymology
Abstract: Theileria annulata secretes peptidyl prolyl isomerase enzyme (TaPIN1) to manipulate the host cell oncogenic signaling pathway by disrupting the tumor suppressor F-box and WD repeat domain-containing 7 (FBW7) protein level leading to an increased level of c-Jun proto-oncogene. Buparvaquone is a hydroxynaphthoquinone anti-theilerial drug and has been used to treat theileriosis. However, TaPIN1 contains the A53P mutation that causes drug resistance. In this study, potential TaPIN1 inhibitors were investigated using a library of naphthoquinone derivatives. Comparative models of mutant (m) and wild type (wt) TaPIN1 were predicted and energy minimization was followed by structure validation. A naphthoquinone (hydroxynaphthalene-1,2-dione, hydroxynaphthalene-1,4-dione) and hydroxynaphthalene-2,3-dione library was screened by Schrödinger Glide HTVS, SP and XP docking methodologies and the docked compounds were ranked by the Glide XP scoring function. The two highest ranked docked compounds, Compound 1 (4-hydroxy-3-[3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxynaphthalene-1,2-dione) and Compound 2 (6-acetyl-1,4,5,7,8-pentahydroxynaphthalene-2,3-dione), were used for further molecular dynamics (MD) simulation studies. The MD results showed that ligand Compound 1 was located in the active site of both mTaPIN1 and wtTaPIN1 and could be proposed as a potential inhibitor by acting as a substrate antagonist. However, ligand Compound 2 was displaced away from the binding pocket of wtTaPIN1 but was located near the active site binding pocket of mTaPIN1, suggesting that it could be selectively evaluated as a potential inhibitor against mTaPIN1. Compound 1 and Compound 2 ligands are potential inhibitors but Compound 2 is suggested as a better inhibitor for mTaPIN1. These ligands could also be further evaluated as potential inhibitors against human peptidyl prolyl isomerase, which causes cancer in humans by using the same mechanism as TaPIN1. (Copyright © 2020 Elsevier Ltd. All rights reserved.)
Published: 2020
Full Text: View/download PDF

8. Accurate predictions of aqueous solubility of drug molecules via the multilevel graph convolutional network (MGCN) and SchNet architectures.

Author: Gao P, Zhang J, Sun Y, and Yu J
Subjects: Databases, Chemical statistics & numerical data, Datasets as Topic statistics & numerical data, Solubility, Deep Learning, Pharmaceutical Preparations chemistry, Water chemistry
Abstract: Deep learning based methods have been widely applied to predict various kinds of molecular properties in the pharmaceutical industry with increasingly more success. In this study, we propose two novel models for aqueous solubility predictions, based on the Multilevel Graph Convolutional Network (MGCN) and SchNet architectures, respectively. The advantage of the MGCN lies in the fact that it could extract the graph features of the target molecules directly from the (3D) structural information; therefore, it doesn't need to rely on a lot of intra-molecular descriptors to learn the features, which are of significance for accurate predictions of the molecular properties. The SchNet performs well in modelling the interatomic interactions inside a molecule, and such a deep learning architecture is also capable of extracting structural information and further predicting the related properties. The actual accuracy of these two novel approaches was systematically benchmarked with four different independent datasets. We found that both the MGCN and SchNet models performed well for aqueous solubility predictions. In the future, we believe such promising predictive models will be applicable to enhancing the efficiency of the screening, crystallization and delivery of drug molecules, essentially as a useful tool to promote the development of molecular pharmaceutics.
Published: 2020
Full Text: View/download PDF

9. A Machine Learning Approach for the Automated Interpretation of Plasma Amino Acid Profiles.

Author: Wilkes EH, Emmett E, Beltran L, Woodward GM, and Carling RS
Subjects: Databases, Chemical statistics & numerical data, Humans, Amino Acids blood, Machine Learning
Abstract: Background: Plasma amino acid (PAA) profiles are used in routine clinical practice for the diagnosis and monitoring of inherited disorders of amino acid metabolism, organic acidemias, and urea cycle defects. Interpretation of PAA profiles is complex and requires substantial training and expertise to perform. Given previous demonstrations of the ability of machine learning (ML) algorithms to interpret complex clinical biochemistry data, we sought to determine if ML-derived classifiers could interpret PAA profiles with high predictive performance. Methods: We collected PAA profiling data routinely performed within a clinical biochemistry laboratory (2084 profiles) and developed decision support classifiers with several ML algorithms. We tested the generalization performance of each classifier using a nested cross-validation (CV) procedure and examined the effect of various subsampling, feature selection, and ensemble learning strategies. Results: The classifiers demonstrated excellent predictive performance, with the 3 ML algorithms tested producing comparable results. The best-performing ensemble binary classifier achieved a mean precision-recall (PR) AUC of 0.957 (95% CI 0.952, 0.962) and the best-performing ensemble multiclass classifier achieved a mean F4 score of 0.788 (0.773, 0.803). Conclusions: This work builds upon previous demonstrations of the utility of ML-derived decision support tools in clinical biochemistry laboratories. Our findings suggest that, pending additional validation studies, such tools could potentially be used in routine clinical practice to streamline and aid the interpretation of PAA profiles. This would be particularly useful in laboratories with limited resources and large workloads. We provide the necessary code for other laboratories to develop their own decision support tools. (© American Association for Clinical Chemistry 2020. All rights reserved. For permissions, please email: journals.permissions@oup.com.)
Published: 2020
Full Text: View/download PDF

10. NPid: an Automatic Approach to Rapid Identification of Known Natural Products in the Crude Extract of Crabapple Based on 2D ¹H-¹³C Heteronuclear Correlation Spectra of the Extract Mixture.

Author: Huang T, Chen P, Liu B, Li X, Lv X, and Hu K
Subjects: Algorithms, Biological Products chemistry, Databases, Chemical statistics & numerical data, Magnetic Resonance Spectroscopy statistics & numerical data, Malus chemistry, Molecular Structure, Plant Extracts chemistry, Proof of Concept Study, Biological Products analysis, Plant Extracts analysis
Abstract: An automatic approach to identification of natural products (NPid) in complex extracts by exploring pure shift HSQC (psHSQC) and H2BC spectra of the mixture is developed, which integrates information on chemical shifts (CS), adjacent relationships (AR) and peak intensities (PI) of ¹H-¹³C groups for identification of candidate natural products in a customized NMR database. A weighted comprehensive score is calculated for each candidate from the values of CS, AR and PI to rate the likelihood of its existence in the complex mixture. Using the crude extract of crabapple (Malus fusca) as an example, a customized NMR database of natural products from plants of the genus Malus was constructed. The performance of NPid was first evaluated using simulated data in four scenarios, that is, for identification of structurally similar natural products, identification of natural products with part of peaks missing in psHSQC due to low concentration, without available adjacent relationship information, or without useful peak intensity information. The false positive and false negative rates of the natural products identified by NPid were estimated by Monte Carlo simulation. The results show that AR and PI can effectively reduce the false positive rate of identification. Proof of concept of the proposed method was demonstrated on a model mixture consisting of 10 known natural products. Application of this method was then demonstrated on an authentic sample of crude extract of crabapple, and 19 known natural products were successfully identified and confirmed by standard spiking.
Published: 2020
Full Text: View/download PDF

11. Focused Library Generator: case of Mdmx inhibitors.

Author: Xia Z, Karpov P, Popowicz G, and Tetko IV
Subjects: Antineoplastic Agents chemistry, Antineoplastic Agents pharmacology, Binding Sites, Cell Cycle Proteins chemistry, Computer-Aided Design statistics & numerical data, Databases, Chemical statistics & numerical data, Databases, Pharmaceutical, Drug Discovery methods, Drug Discovery statistics & numerical data, Humans, Ligands, Molecular Docking Simulation, Molecular Dynamics Simulation, Neural Networks, Computer, Protein Binding, Proto-Oncogene Proteins chemistry, Quantitative Structure-Activity Relationship, Cell Cycle Proteins antagonists & inhibitors, Drug Design, Proto-Oncogene Proteins antagonists & inhibitors, Small Molecule Libraries
Abstract: We present a Focused Library Generator that is able to create from scratch new molecules with desired properties. After training the Generator on the ChEMBL database, transfer learning was used to switch the generator to producing new Mdmx inhibitors that are a promising class of anticancer drugs. Lilly medicinal chemistry filters, molecular docking, and a QSAR IC50 model were used to refine the output of the Generator. Pharmacophore screening and molecular dynamics (MD) simulations were then used to further select putative ligands. Finally, we identified five promising hits with equivalent or even better predicted binding free energies and IC50 values than known Mdmx inhibitors. The source code of the project is available on https://github.com/bigchem/online-chem.
Published: 2020
Full Text: View/download PDF

12. Diversifying chemical libraries with generative topographic mapping.

Author: Lin A, Beck B, Horvath D, Marcou G, and Varnek A
Subjects: Algorithms, Computer-Aided Design statistics & numerical data, Databases, Chemical statistics & numerical data, Databases, Pharmaceutical statistics & numerical data, Drug Design, Drug Development statistics & numerical data, Drug Discovery statistics & numerical data, Humans, Molecular Structure, Software, User-Computer Interface, Drug Discovery methods, Small Molecule Libraries
Abstract: Generative topographic mapping was used to investigate the possibility to diversify the in-house compounds collection of Boehringer Ingelheim (BI). For this purpose, a 2D map covering the relevant chemical space was trained, and the BI compound library was compared to the Aldrich-Market Select (AMS) database of more than 8M purchasable compounds. In order to discover new (sub)structures, the "AutoZoom" tool was developed and applied in order to analyze chemotypes of molecules residing in heavily populated zones of a map and to extract the corresponding maximum common substructures. A set of 401K new structures from the AMS database was retrieved and checked for drug-likeness and biological activity.
Published: 2020
Full Text: View/download PDF

13. Conditional Prediction of Ribonucleic Acid Secondary Structure Using Chemical Shifts.

Author: Zhang K and Frank AT
Subjects: Algorithms, Base Pairing, Databases, Chemical statistics & numerical data, Machine Learning, Neural Networks, Computer, Nuclear Magnetic Resonance, Biomolecular, Nucleic Acid Conformation, RNA chemistry
Abstract: Inspired by methods that utilize chemical-mapping data to guide secondary structure prediction, we sought to develop a framework for using assigned chemical shift data to guide ribonucleic acid (RNA) secondary structure prediction. We first used machine learning to develop classifiers that predict the base-pairing status of individual residues in an RNA based on their assigned chemical shifts. Then, we used these base-pairing status predictions as restraints to guide RNA folding algorithms. Our results showed that we could recover the correct secondary fold of most of the 108 RNAs in our data set with remarkable accuracy. Finally, we tested whether we could use the base-pairing status predictions that we obtained from assigned chemical shift data to conditionally predict the secondary structure of RNA. To achieve this, we attempted to model two distinct conformational states of the microRNA-20b and the fluoride riboswitch using assigned chemical shifts that were available for both conformational states of each of these test RNAs. For both test cases, we found that by using the base-pairing status predictions that we obtained from assigned chemical shift data as folding restraints, we could generate structures that closely resembled the known structure of the two distinct states. A command-line tool for chemical shifts to base-pairing status predictions in RNA has been incorporated into our CS2Structure Git repository and can be accessed via https://github.com/atfrank/CS2Structure.
Published: 2020
Full Text: View/download PDF

14. CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens.

Author: Wang YW, Huang L, Jiang SW, Li K, Zou J, and Yang SY
Subjects: Animals, Databases, Chemical statistics & numerical data, Rats, Carcinogens chemistry, Deep Learning
Abstract: Determining chemical carcinogenicity in the early stages of drug discovery is fundamentally important to prevent the adverse effect of carcinogens on human health. There has been a recent surge of interest in developing computational approaches to predict chemical carcinogenicity. However, the predictive power of many existing approaches is limited, and there is plenty of room for improvement. Here, we develop a new deep learning architecture, termed CapsCarcino, to distinguish between carcinogens and noncarcinogens. CapsCarcino is constructed based on a dynamic routing algorithm that requires less data, extracts more comprehensive information, and does not require feature selection. We find that CapsCarcino provides significantly improved predictive and generalization ability, outperforming five other machine learning models. Specifically, the best model of CapsCarcino achieves an accuracy of 85.0% on an external validation dataset. In addition, we discover that the enhanced predictive capability of CapsCarcino over that of the other methods is robust and can be achieved using sparse datasets. Training on merely 20% of the dataset, CapsCarcino performs comparably to the other methods based on the full training dataset. Further mechanism analysis indicates that CapsCarcino could efficiently learn the characteristics of carcinogens even if structural alerts are insufficiently represented. The results indicate that CapsCarcino should be helpful for carcinogen risk assessment. (Copyright © 2019 Elsevier Ltd. All rights reserved.)
Published: 2020
Full Text: View/download PDF

15. Predicting drug-target interaction network using deep learning model.

Author: You J, McLeod RD, and Hu P
Subjects: Amino Acid Sequence, Antineoplastic Agents chemistry, Breast Neoplasms genetics, Computational Biology methods, Databases, Chemical statistics & numerical data, Databases, Protein statistics & numerical data, Drug Repositioning, Genes, Neoplasm drug effects, Molecular Structure, Protein Binding, Protein Domains, Proteins chemistry, Support Vector Machine, Antineoplastic Agents metabolism, Deep Learning, Models, Chemical, Proteins metabolism
Abstract: Background: Traditional methods for drug discovery are time-consuming and expensive, so efforts are being made to repurpose existing drugs. To find new ways for drug repurposing, many computational approaches have been proposed to predict drug-target interactions (DTIs). However, due to the high-dimensional nature of the data sets extracted from drugs and targets, traditional machine learning approaches, such as logistic regression analysis, cannot analyze these data sets efficiently. To overcome this issue, we propose LASSO (Least absolute shrinkage and selection operator)-based regularized linear classification models and a LASSO-DNN (Deep Neural Network) model based on LASSO feature selection to predict DTIs. These methods are demonstrated for repurposing drugs for breast cancer treatment. Methods: We collected drug descriptors, protein sequence data from Drugbank and protein domain information from NCBI. Validated DTIs were downloaded from Drugbank. A new similarity-based approach was developed to build the negative DTIs. We proposed multiple LASSO models to integrate different combinations of feature sets to explore the prediction power and predict DTIs. Furthermore, building on the features extracted from the LASSO models with the best performance, we also introduced a LASSO-DNN model to predict DTIs. The performance of our newly proposed DNN model (LASSO-DNN) was compared with the LASSO, standard logistic (SLG) regression, support vector machine (SVM), and standard DNN models. Results: Experimental results showed that the LASSO-DNN outperformed the SLG, LASSO, SVM and standard DNN models. In particular, the LASSO models with protein tripeptide composition (TC) features and domain features were superior to those that contained other protein information, which may imply that TC and domain information could be better representations of proteins. Furthermore, we showed that the top ranked DTIs predicted using the LASSO-DNN model can potentially be used for repurposing existing drugs for breast cancer based on risk gene information. Conclusions: In summary, we demonstrated that the efficient representations of drug and target features are key for building learning models for predicting DTIs. The disease-associated risk genes identified from large-scale genomic studies are the potential drug targets, which can be used for drug repurposing. (Copyright © 2019 Elsevier Ltd. All rights reserved.)
Published: 2019
Full Text: View/download PDF

16. Identification of novel Plasmodium falciparum PI4KB inhibitors as potential anti-malarial drugs: Homology modeling, molecular docking and molecular dynamics simulations.

Author: Ibrahim MAA, Abdelrahman AHM, and Hassan AMA
Subjects: 1-Phosphatidylinositol 4-Kinase chemistry, Amino Acid Sequence, Antimalarials chemistry, Catalytic Domain, Databases, Chemical statistics & numerical data, Drug Discovery, Ligands, Molecular Docking Simulation, Molecular Dynamics Simulation, Molecular Structure, Protein Binding, Protein Kinase Inhibitors chemistry, Sequence Alignment, 1-Phosphatidylinositol 4-Kinase antagonists & inhibitors, 1-Phosphatidylinositol 4-Kinase metabolism, Antimalarials metabolism, Plasmodium falciparum enzymology, Protein Kinase Inhibitors metabolism
Abstract: The current study was set to discover selective Plasmodium falciparum phosphatidylinositol-4-OH kinase type III beta (pfPI4KB) inhibitors as potential antimalarial agents using a combined structure-based and ligand-based drug discovery approach. A comparative model of pfPI4KB was first constructed and validated using molecular docking techniques. Performance of Autodock4.2 and Vina4 software in predicting the inhibitor-PI4KB binding mode and energy was assessed based on two Test Sets: Test Set I contained five ligands with resolved crystal structures with PI4KB, while Test Set II considered eleven compounds with known IC50 value towards PI4KB. The outperformance of Autodock as compared to Vina was reported, giving a correlation coefficient (R²) value of 0.87 and 0.90 for Test Set I and Test Set II, respectively. Pharmacophore-based screening was then conducted to identify drug-like molecules from the ZINC database with physicochemical similarity to two potent pfPI4KB inhibitors, namely cpa and cpb. For each query inhibitor, the best 1000 hits in terms of TanimotoCombo scores were selected and subjected to molecular docking and molecular dynamics (MD) calculations. Binding energy was then estimated using the molecular mechanics-generalized Born surface area (MM-GBSA) approach over 50 ns MD simulations of the inhibitor-pfPI4KB complexes. According to the calculated MM-GBSA binding energies, ZINC78988474 and ZINC20564116 were identified as potent pfPI4KB inhibitors with binding energies better than those of cpa and cpb, with ΔG binding ≥ -34.56 kcal/mol. The inhibitor-pfPI4KB interaction and stability were examined over 50 ns MD simulation; the selectivity of the identified inhibitors towards pfPI4KB over PI4KB was reported as well. (Copyright © 2019. Published by Elsevier Ltd.)
Published: 2019
Full Text: View/download PDF

17. Evaluation of an Artificial Neural Network Retention Index Model for Chemical Structure Identification in Nontargeted Metabolomics.

Author: Samaraweera MA, Hall LM, Hill DW, and Grant DF
Subjects: Algorithms, Chromatography, Liquid, Computer Simulation, Molecular Structure, Spectrometry, Mass, Electrospray Ionization, Databases, Chemical statistics & numerical data, Metabolomics methods, Models, Chemical, Neural Networks, Computer
Abstract: Liquid chromatography coupled with electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS) is a major analytical technique used for nontargeted identification of metabolites in biological fluids. Typically, in LC-ESI-MS/MS based database assisted structure elucidation pipelines, the exact mass of an unknown compound is used to mine a chemical structure database to acquire an initial set of possible candidates. Subsequent matching of the collision induced dissociation (CID) spectrum of the unknown to the CID spectra of candidate structures facilitates identification. However, this approach often fails because of the large numbers of potential candidates (i.e., false positives) for which CID spectra are not available. To overcome this problem, CID fragmentation prediction programs have been developed, but these also have limited success if large numbers of isomers with similar CID spectra are present in the candidate set. In this study, we investigated the use of a retention index (RI) predictive model as an orthogonal method to help improve identification rates. The model was used to eliminate candidate structures whose predicted RI values differed significantly from the experimentally determined RI value of the unknown compound. We tested this approach using a set of ninety-one endogenous metabolites and four in silico CID fragmentation algorithms: CFM-ID, CSI:FingerID, Mass Frontier, and MetFrag. Candidate sets obtained from PubChem and the Human Metabolite Database (HMDB) were ranked with and without RI filtering followed by in silico spectral matching. Upon RI filtering, 12 of the ninety-one metabolites were eliminated from their respective candidate sets, i.e., were scored incorrectly as negatives. For the remaining seventy-nine compounds, we show that RI filtering eliminated an average of 58% from PubChem candidate sets. This resulted in an approximately 2-fold improvement in average rankings when using CFM-ID, Mass Frontier, and MetFrag. In addition, RI filtering slightly increased the occurrence of number one rankings for all 4 fragmentation algorithms. However, RI filtering did not significantly improve average rankings when HMDB was used as the candidate database, nor did it significantly improve average rankings when using CSI:FingerID. Overall, we show that the current RI model incorrectly eliminated more true positives (12) than were expected (4-5) on the basis of the filtering method. However, it slightly improved the number of correct first place rankings and improved overall average rankings when using CFM-ID, Mass Frontier, and MetFrag.
Published: 2018
Full Text: View/download PDF
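
The retention index (RI) filtering step described in the abstract of result 17 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Candidate` type, the candidate names, the RI values, and the `max_delta` tolerance are all hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str            # hypothetical candidate identifier
    predicted_ri: float  # RI predicted by some model (values below are made up)


def filter_by_ri(candidates, experimental_ri, max_delta):
    """Keep candidates whose predicted RI lies within max_delta of the measured RI."""
    return [c for c in candidates
            if abs(c.predicted_ri - experimental_ri) <= max_delta]


candidates = [
    Candidate("cand_A", 512.0),
    Candidate("cand_B", 640.0),
    Candidate("cand_C", 530.0),
]
kept = filter_by_ri(candidates, experimental_ri=520.0, max_delta=25.0)
print([c.name for c in kept])  # cand_B is dropped: its predicted RI is 120 units off
```

The tolerance is the sensitive choice here: if it is too tight, the true structure itself can be filtered out, which is the failure mode the abstract reports for 12 of the ninety-one metabolites.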

18. In Silico Prediction of Blood-Brain Barrier Permeability of Compounds by Machine Learning and Resampling Methods.

Author: Wang Z, Yang H, Wu Z, Wang T, Li W, Tang Y, and Liu G
Subjects: Algorithms, Models, Chemical, Organic Chemicals chemistry, Permeability, Blood-Brain Barrier metabolism, Computer Simulation, Databases, Chemical statistics & numerical data, Organic Chemicals pharmacokinetics, Support Vector Machine
Abstract: The blood-brain barrier (BBB) as a part of absorption protects the central nervous system by separating the brain tissue from the bloodstream. In recent years, BBB permeability has become a critical issue in chemical ADMET prediction, but almost all models were built using imbalanced data sets, which caused a high false-positive rate. Therefore, we tried to solve the problem of biased data sets and built a reliable classification model with 2358 compounds. Machine learning and resampling methods were used simultaneously for the refinement of models with both 2D molecular descriptors and molecular fingerprints to represent the chemicals. Through a series of evaluations, we realized that resampling methods such as Synthetic Minority Oversampling Technique (SMOTE) and SMOTE+edited nearest neighbor could effectively solve the problem of imbalanced data sets and that MACCS fingerprint combined with support vector machine performed the best. After the final construction of a consensus model, the overall accuracy rate was increased to 0.966 for the final external data set. Also, the accuracy rate of the model for the test set was 0.919, with an excellent balanced capacity of 0.925 (sensitivity) to predict BBB-positive compounds and of 0.899 (specificity) to predict BBB-negative compounds. Compared with other BBB classification models, our models reduced the rate of false positives and were more robust in prediction of BBB-positive as well as BBB-negative compounds, which would be quite helpful in early drug discovery. (© 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.)
Published: 2018
Full Text: View/download PDF
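
The abstract of result 18 names SMOTE as one of the resampling methods. The core idea of SMOTE, creating a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours, might be sketched as below. This is a simplified illustration, not the study's pipeline: the function name `smote_sample`, the toy points, and the choice of `k` are assumptions.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and one of
    its k nearest minority neighbours (squared Euclidean distance)."""
    if rng is None:
        rng = random.Random(0)
    base_index = rng.randrange(len(minority))
    base = minority[base_index]
    others = [p for i, p in enumerate(minority) if i != base_index]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


# Toy two-dimensional minority class (made-up descriptor values).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_sample(minority, k=2, rng=random.Random(1))
print(synthetic)
```

Because each synthetic point lies on a line segment between two real minority samples, it always stays inside the region the minority class already occupies.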

19. Modelling methods and cross-validation variants in QSAR: a multi-level analysis.

Author: Rácz A, Bajusz D, and Héberger K
Subjects: Analysis of Variance, Databases, Chemical statistics & numerical data, Toxicity Tests statistics & numerical data, Drug Discovery methods, Models, Molecular, Quantitative Structure-Activity Relationship
Abstract: Prediction performance often depends on the cross- and test validation protocols applied. Several combinations of different cross-validation variants and model-building techniques were used to reveal their complexity. Two case studies (acute toxicity data) were examined, applying five-fold cross-validation (with random, contiguous and Venetian blind forms) and leave-one-out cross-validation (CV). External test sets showed the effects and differences between the validation protocols. The models were generated with multiple linear regression (MLR), principal component regression (PCR), partial least squares (PLS) regression, artificial neural networks (ANN) and support vector machines (SVM). The comparisons were made by the sum of ranking differences (SRD) and factorial analysis of variance (ANOVA). The largest bias and variance could be assigned to the MLR method and contiguous block cross-validation. SRD can provide a unique and unambiguous ranking of methods and CV variants. Venetian blind cross-validation is a promising tool. The generated models were also compared based on their basic performance parameters (r² and Q²). MLR produced the largest gap, while PCR gave the smallest. Although PCR is the best validated and balanced technique, SVM always outperformed the other methods, when experimental values were the benchmark. Variable selection was advantageous, and the modelling had a larger influence than CV variants.
Published: 2018
Full Text: View/download PDF
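
The three five-fold splitting schemes named in the abstract of result 19 (random, contiguous and Venetian blind) differ only in how samples are assigned to folds. A minimal sketch of the usual definitions follows. The function names are illustrative and the code is not taken from the paper.

```python
import random


def contiguous_folds(n, k):
    """Assign samples to k consecutive blocks of near-equal size."""
    return [i * k // n for i in range(n)]


def venetian_blind_folds(n, k):
    """Assign every k-th sample to the same fold (interleaved pattern)."""
    return [i % k for i in range(n)]


def random_folds(n, k, seed=0):
    """Shuffle a balanced fold assignment so membership is random."""
    folds = venetian_blind_folds(n, k)
    random.Random(seed).shuffle(folds)
    return folds


print(contiguous_folds(10, 5))      # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
print(venetian_blind_folds(10, 5))  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
```

If the samples are ordered, for example sorted by response value, contiguous blocks leave out a whole region of the data at once, whereas the Venetian blind pattern spreads each fold across the full range.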

20. NMReDATA, a standard to report the NMR assignment and parameters of organic compounds.

Author: Pupier M, Nuzillard JM, Wist J, Schlörer NE, Kuhn S, Erdelyi M, Steinbeck C, Williams AJ, Butts C, Claridge TDW, Mikhova B, Robien W, Dashti H, Eghbalnia HR, Farès C, Adam C, Kessler P, Moriaud F, Elyashberg M, Argyropoulos D, Pérez M, Giraudeau P, Gil RR, Trevorrow P, and Jeannerat D
Subjects: Databases, Chemical statistics & numerical data, Software standards, Information Storage and Retrieval standards, Magnetic Resonance Spectroscopy statistics & numerical data, Organic Chemicals chemistry
Abstract: Even though NMR has found countless applications in the field of small molecule characterization, there is no standard file format available for the NMR data relevant to structure characterization of small molecules. A new format is therefore introduced to associate the NMR parameters extracted from 1D and 2D spectra of organic compounds to the proposed chemical structure. These NMR parameters, which we shall call NMReDATA (for nuclear magnetic resonance extracted data), include chemical shift values, signal integrals, intensities, multiplicities, scalar coupling constants, lists of 2D correlations, relaxation times, and diffusion rates. The file format is an extension of the existing Structure Data Format, which is compatible with the commonly used MOL format. The association of an NMReDATA file with the raw and spectral data from which it originates constitutes an NMR record. This format is easily readable by humans and computers and provides a simple and efficient way for disseminating results of structural chemistry investigations, allowing automatic verification of published results, and for assisting the constitution of highly needed open-source structural databases. (Copyright © 2018 John Wiley & Sons, Ltd.)
Published: 2018
Full Text: View/download PDF

21. Digital chemical test impresses.

Author: Zainzinger V
Subjects: Animals, Computers, Animal Use Alternatives methods, Databases, Chemical statistics & numerical data, Toxicity Tests methods
Published: 2018
Full Text: View/download PDF

22. METLIN: A Technology Platform for Identifying Knowns and Unknowns.

Author: Guijas C, Montenegro-Burke JR, Domingo-Almenara X, Palermo A, Warth B, Hermann G, Koellensperger G, Huan T, Uritboonthai W, Aisporna AE, Wolan DW, Spilker ME, Benton HP, and Siuzdak G
Subjects: Metabolomics methods, Metabolomics statistics & numerical data, Pichia chemistry, Pichia metabolism, Tandem Mass Spectrometry statistics & numerical data, Cell Extracts analysis, Databases, Chemical statistics & numerical data, Datasets as Topic statistics & numerical data
Abstract: METLIN originated as a database to characterize known metabolites and has since expanded into a technology platform for the identification of known and unknown metabolites and other chemical entities. Through this effort it has become a comprehensive resource containing over 1 million molecules including lipids, amino acids, carbohydrates, toxins, small peptides, and natural products, among other classes. METLIN's high-resolution tandem mass spectrometry (MS/MS) database, which plays a key role in the identification process, has data generated from both reference standards and their labeled stable isotope analogues, facilitated by METLIN-guided analysis of isotope-labeled microorganisms. The MS/MS data, coupled with the fragment similarity search function, expand the tool's capabilities into the identification of unknowns. Fragment similarity search is performed independent of the precursor mass, relying solely on the fragment ions to identify similar structures within the database. Stable isotope data also facilitate characterization by coupling the similarity search output with the isotopic m/z shifts. Examples of both are demonstrated here with the characterization of four previously unknown metabolites. METLIN also now features in silico MS/MS data, which has been made possible through the creation of algorithms trained on METLIN's MS/MS data from both standards and their isotope analogues. With these informatic and experimental data features, METLIN is being designed to address the characterization of known and unknown molecules.
Published: 2018
Full Text: View/download PDF

23. Pink-beam serial crystallography.

Author: Meents A, Wiedorn MO, Srajer V, Henning R, Sarrou I, Bergtholdt J, Barthelmess M, Reinke PYA, Dierksmeyer D, Tolstikova A, Schaible S, Messerschmidt M, Ogata CM, Kissick DJ, Taft MH, Manstein DJ, Lieske J, Oberthuer D, Fischetti RF, and Chapman HN
Subjects: Crystallography, X-Ray instrumentation, Crystallography, X-Ray statistics & numerical data, Databases, Chemical statistics & numerical data, Endopeptidase K chemistry, Equipment Design, Models, Molecular, Phycocyanin chemistry, Protein Conformation, Static Electricity, Synchrotrons, X-Ray Diffraction, Crystallography, X-Ray methods
Abstract: Serial X-ray crystallography allows macromolecular structure determination at both X-ray free electron lasers (XFELs) and, more recently, synchrotron sources. The time resolution for serial synchrotron crystallography experiments has been limited to millisecond timescales with monochromatic beams. The polychromatic, "pink", beam provides a more than two orders of magnitude increased photon flux and hence allows accessing much shorter timescales in diffraction experiments at synchrotron sources. Here we report the structure determination of two different protein samples by merging pink-beam diffraction patterns from many crystals, each collected with a single 100 ps X-ray pulse exposure per crystal using a setup optimized for very low scattering background. In contrast to experiments with monochromatic radiation, data from only 50 crystals were required to obtain complete datasets. The high quality of the diffraction data highlights the potential of this method for studying irreversible reactions at sub-microsecond timescales using high-brightness X-ray facilities.
Published: 2017
Full Text: View/download PDF

24. Customized Consensus Spectral Library Building for Untargeted Quantitative Metabolomics Analysis with Data Independent Acquisition Mass Spectrometry and MetaboDIA Workflow.

Author: Chen G, Walmsley S, Cheung GCM, Chen L, Cheng CY, Beuerman RW, Wong TY, Zhou L, and Choi H
Subjects: Aged, Chlorophyta chemistry, Female, Humans, Male, Metabolomics statistics & numerical data, Workflow, Computational Biology methods, Databases, Chemical statistics & numerical data, Metabolome, Metabolomics methods, Tandem Mass Spectrometry methods
Abstract: Data independent acquisition-mass spectrometry (DIA-MS) coupled with liquid chromatography is a promising approach for rapid, automatic sampling of MS/MS data in untargeted metabolomics. However, wide isolation windows in DIA-MS generate MS/MS spectra containing a mixed population of fragment ions together with their precursor ions. This precursor-fragment ion map in a comprehensive MS/MS spectral library is crucial for relative quantification of fragment ions uniquely representative of each precursor ion. However, existing reference libraries are not sufficient for this purpose since the fragmentation patterns of small molecules can vary in different instrument setups. Here we developed a bioinformatics workflow called MetaboDIA to build customized MS/MS spectral libraries using a user's own data dependent acquisition (DDA) data and to perform MS/MS-based quantification with DIA data, thus complementing conventional MS1-based quantification. MetaboDIA also allows users to build a spectral library directly from DIA data in studies of a large sample size. Using a marine algae data set, we show that quantification of fragment ions extracted with a customized MS/MS library can provide as reliable quantitative data as the direct quantification of precursor ions based on MS1 data. To test its applicability in complex samples, we applied MetaboDIA to a clinical serum metabolomics data set, where we built a DDA-based spectral library containing consensus spectra for 1829 compounds. We performed fragment ion quantification using DIA data using this library, yielding sensitive differential expression analysis.
Published: 2017
Full Text: View/download PDF

25. Technically Extended MultiParameter Optimization (TEMPO): An Advanced Robust Scoring Scheme To Calculate Central Nervous System Druggability and Monitor Lead Optimization.

Author: Ghose AK, Ott GR, and Hudkins RL
Subjects: Animals, Central Nervous System metabolism, Chemical Phenomena, Databases, Chemical statistics & numerical data, Humans, Monitoring, Physiologic, Central Nervous System drug effects, Central Nervous System Agents chemistry, Central Nervous System Agents pharmacology, Drug Design, Models, Chemical
Abstract: At the discovery stage, it is important to understand the drug design concepts for a CNS drug compared to those for a non-CNS drug. Previously, we published on ideal CNS drug property space and defined in detail the physicochemical property distribution of CNS versus non-CNS oral drugs, the application of radar charting (a graphical representation of multiple physicochemical properties used during CNS lead optimization), and a recursive partition classification tree to differentiate between CNS- and non-CNS drugs. The objective of the present study was to further understand the differentiation of physicochemical properties between CNS and non-CNS oral drugs by the development and application of a new CNS scoring scheme: Technically Extended MultiParameter Optimization (TEMPO). In this multiparameter method, we identified eight key physicochemical properties critical for accurately assessing CNS druggability: (1) number of basic amines, (2) carbon-heteroatom (non-carbon, non-hydrogen) ratio, (3) number of aromatic rings, (4) number of chains, (5) number of rotatable bonds, (6) number of H-acceptors, (7) computed octanol/water partition coefficient (AlogP), and (8) number of nonconjugated C atoms in nonaromatic rings. Significant features of the CNS-TEMPO penalty score are the extension of the multiparameter approach to generate an accurate weight factor for each physicochemical property, the use of limits on both sides of the computed property space range during the penalty calculation, and the classification of CNS and non-CNS drug scores. CNS-TEMPO significantly outperformed CNS-MPO and the Schrödinger QikProp CNS parameter (QP_CNS) in evaluating CNS drugs and has been extensively applied in support of CNS lead optimization programs.
Published: 2017
Full Text: View/download PDF

26. LOBSTAHS: An Adduct-Based Lipidomics Strategy for Discovery and Identification of Oxidative Stress Biomarkers.

Author: Collins JR, Edwards BR, Fredricks HF, and Van Mooy BA
Subjects: Biomarkers chemistry, Biomarkers metabolism, Chromatography, Liquid, Databases, Chemical statistics & numerical data, Diatoms chemistry, Hydrogen Peroxide adverse effects, Isomerism, Lipid Metabolism, Lipids chemistry, Mass Spectrometry, Oxidative Stress drug effects, Oxylipins chemistry, Oxylipins metabolism, Biomarkers analysis, High-Throughput Screening Assays methods, Lipids analysis, Oxylipins analysis
Abstract: Discovery and identification of molecular biomarkers in large LC/MS data sets requires significant automation without loss of accuracy in the compound screening and annotation process. Here, we describe a lipidomics workflow and open-source software package for high-throughput annotation and putative identification of lipid, oxidized lipid, and oxylipin biomarkers in high-mass-accuracy HPLC-MS data. Lipid and oxylipin biomarker screening through adduct hierarchy sequences, or LOBSTAHS, uses orthogonal screening criteria based on adduct ion formation patterns and other properties to identify thousands of compounds while providing the user with a confidence score for each assignment. Assignments are made from one of two customizable databases; the default databases contain 14 068 unique entries. To demonstrate the software's functionality, we screened more than 340 000 mass spectral features from an experiment in which hydrogen peroxide was used to induce oxidative stress in the marine diatom Phaeodactylum tricornutum. LOBSTAHS putatively identified 1969 unique parent compounds in 21 869 features that survived the multistage screening process. While P. tricornutum maintained more than 92% of its core lipidome under oxidative stress, patterns in biomarker distribution and abundance indicated remodeling was both subtle and pervasive. Treatment with 150 μM H₂O₂ promoted statistically significant carbon-chain elongation across lipid classes, with the strongest elongation accompanying oxidation in moieties of monogalactosyldiacylglycerol, a lipid typically localized to the chloroplast. Oxidative stress also induced a pronounced reallocation of lipidome peak area to triacylglycerols. LOBSTAHS can be used with environmental or experimental data from a variety of systems and is freely available at https://github.com/vanmooylipidomics/LOBSTAHS.
Published: 2016
Full Text: View/download PDF

27. Bayesian integrated testing strategy (ITS) for skin sensitization potency assessment: a decision support system for quantitative weight of evidence and adaptive testing strategy.

Author: Jaworska JS, Natsch A, Ryan C, Strickland J, Ashikaga T, and Miyazawa M
Subjects: Animal Testing Alternatives methods, Animals, Bias, Databases, Chemical statistics & numerical data, Humans, Local Lymph Node Assay, Reproducibility of Results, Risk Assessment methods, Bayes Theorem, Decision Support Techniques, Dermatitis, Allergic Contact etiology, Skin Tests methods
Abstract: The presented Bayesian network Integrated Testing Strategy (ITS-3) for skin sensitization potency assessment is a decision support system for a risk assessor that provides quantitative weight of evidence, leading to a mechanistically interpretable potency hypothesis, and formulates adaptive testing strategy for a chemical. The system was constructed with an aim to improve precision and accuracy for predicting LLNA potency beyond ITS-2 (Jaworska et al., J Appl Toxicol 33(11):1353-1364, 2013) by improving representation of chemistry and biology. Among novel elements are corrections for bioavailability both in vivo and in vitro as well as consideration of the individual assays' applicability domains in the prediction process. In ITS-3 structure, three validated alternative assays, DPRA, KeratinoSens and h-CLAT, represent first three key events of the adverse outcome pathway for skin sensitization. The skin sensitization potency prediction is provided as a probability distribution over four potency classes. The probability distribution is converted to Bayes factors to: 1) remove prediction bias introduced by the training set potency distribution and 2) express uncertainty in a quantitative manner, allowing transparent and consistent criteria to accept a prediction. The novel ITS-3 database includes 207 chemicals with a full set of in vivo and in vitro data. The accuracy for predicting LLNA outcomes on the external test set (n = 60) was as follows: hazard (two classes): 100%; GHS potency classification (three classes): 96%; potency (four classes): 89%. This work demonstrates that skin sensitization potency prediction based on data from three key events, and often less, is possible, reliable over broad chemical classes and ready for practical applications.
Published: 2015
Full Text: View/download PDF
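
The abstract of result 27 states that the predicted probability distribution over potency classes is converted to Bayes factors to remove the bias introduced by the training-set class distribution. One common way to express this, shown here as an assumption because the abstract does not give the paper's exact formula, is posterior odds divided by prior odds for each class. The class labels and probabilities below are hypothetical.

```python
def bayes_factors(posterior, prior):
    """Per-class Bayes factor: posterior odds divided by prior odds."""
    factors = {}
    for cls, p in posterior.items():
        q = prior[cls]
        factors[cls] = (p / (1.0 - p)) / (q / (1.0 - q))
    return factors


# Hypothetical numbers for four potency classes (not taken from the paper).
prior = {"non": 0.25, "weak": 0.25, "moderate": 0.25, "strong": 0.25}
posterior = {"non": 0.10, "weak": 0.20, "moderate": 0.50, "strong": 0.20}

bf = bayes_factors(posterior, prior)
best = max(bf, key=bf.get)
print(best, round(bf[best], 3))
```

A factor above 1 means the evidence raised the odds of that class relative to its share of the training set, and a factor below 1 means it lowered them.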

28. Identification of Chemical Toxicity Using Ontology Information of Chemicals.

Author: Jiang Z, Xu R, and Dong C
Subjects: Combinatorial Chemistry Techniques, Computational Biology, Databases, Chemical statistics & numerical data, Drug Design, Humans, Knowledge Bases, Drug Discovery, Drug-Related Side Effects and Adverse Reactions
Abstract: With advances in combinatorial chemistry, the number of synthetic compounds has surged. However, we have limited knowledge about them. On the other hand, the speed of designing new drugs is very slow. One of the key causes is the unacceptable toxicity of chemicals. If one can correctly identify the toxicity of chemicals, the unsuitable chemicals can be discarded at an early stage, thereby accelerating the study of new drugs and reducing R&D costs. In this study, a new prediction method was built for identification of chemical toxicities, which was based on ontology information of chemicals. Compared with a previous method, our method is quite effective. We hope that the proposed method may give new insights for studying chemical toxicity and other attributes of chemicals.
Published: 2015
Full Text: View/download PDF

29. MOSAIC: a data model and file formats for molecular simulations.

Author: Hinsen K
Subjects: Computational Biology, Databases, Chemical statistics & numerical data, Molecular Conformation, Molecular Structure, Models, Molecular, Molecular Dynamics Simulation statistics & numerical data, Software
Abstract: The MOlecular SimulAtion Interchange Conventions (MOSAIC) consist of a data model for molecular simulations and of concrete implementations of this data model in the form of file formats. MOSAIC is designed as a modular set of specifications, of which the initial version covers molecular structure and configurations. A reference implementation in the Python language facilitates the development of simulation software based on MOSAIC.
Published: 2014
Full Text: View/download PDF

30. Data set modelability by QSAR.

Author: Golbraikh A, Muratov E, Fourches D, and Tropsha A
Subjects: Computational Biology, Drug Design, Databases, Chemical statistics & numerical data, Models, Chemical, Quantitative Structure-Activity Relationship
Abstract: We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class versus the total number of pairs. The MODI values were calculated for more than 100 data sets, and the threshold of 0.65 was found to separate the nonmodelable and modelable data sets.
Published: 2014
Full Text: View/download PDF
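
The MODelability Index (MODI) from result 30 is described in the abstract as a class-weighted ratio based on nearest-neighbour pairs. A minimal sketch, assuming Euclidean distance on numeric descriptors and reading the definition as the per-class fraction of compounds whose nearest neighbour shares their class, averaged over classes, could look like this. The toy descriptors and labels are invented.

```python
def modi(features, labels):
    """Mean over classes of the fraction of compounds whose nearest
    neighbour (Euclidean, excluding the compound itself) shares their class."""
    same = {}
    total = {}
    for i, (x, y) in enumerate(zip(features, labels)):
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, features[j])),
        )
        total[y] = total.get(y, 0) + 1
        same[y] = same.get(y, 0) + (labels[nearest] == y)
    return sum(same[c] / total[c] for c in total) / len(total)


# Toy one-dimensional descriptors: two well-separated activity classes.
features = [(0.0,), (0.1,), (1.0,), (1.1,)]
labels = [0, 0, 1, 1]
print(modi(features, labels))  # 1.0
```

According to the abstract, a MODI of about 0.65 was the threshold that separated nonmodelable from modelable data sets in the authors' collection.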

31. Chromosome 19p in Alzheimer's disease: when genome meets transcriptome.

Author: Wang J, Feng X, Bai Z, Jin LW, Duan Y, and Lei H
Subjects: Alzheimer Disease pathology, Brain metabolism, Databases, Chemical statistics & numerical data, Gene Expression, Gene Regulatory Networks, Humans, Microarray Analysis, Alzheimer Disease genetics, Chromosomes, Human, Pair 19 genetics, Genetic Predisposition to Disease, Transcriptome genetics
Abstract: Genetic studies have identified several genomic loci including chr19p13.2 relevant to Alzheimer's disease (AD) susceptibility. However, the functional roles of these genomic loci in AD pathogenesis require further clarification. Transcriptome as an endophenotype is critical for the understanding of disease mechanism. Here we demonstrate that chr19p is the most significantly perturbed chromosome region in AD brain transcriptome. With dual evidence from genome and transcriptome, chr19p likely plays a special role in AD pathogenesis.
Published: 2014
Full Text: View/download PDF

32. What is the likelihood of an active compound to be promiscuous? Systematic assessment of compound promiscuity on the basis of PubChem confirmatory bioassay data.

Author: Hu Y and Bajorath J
Subjects: Biological Assay statistics & numerical data, Structure-Activity Relationship, Databases, Chemical statistics & numerical data, Databases, Factual statistics & numerical data, Pharmaceutical Preparations chemistry, Probability
Abstract: Compound promiscuity refers to the ability of small molecules to specifically interact with multiple targets, which represents the origin of polypharmacology. Promiscuity is thought to be a widespread characteristic of pharmaceutically relevant compounds. Yet, the degree of promiscuity among active compounds from different sources remains uncertain. Here, we report a thorough analysis of compound promiscuity on the basis of more than 1,000 PubChem confirmatory bioassays, which yields an upper-limit assessment of promiscuity among active compounds. Because most PubChem compounds have been tested in large numbers of assays, data sparseness has not been a limiting factor for the current analysis. We have determined that there is an overall likelihood of ∼50% of an active PubChem compound to interact with two or more targets. The probability to interact with more than five targets is reduced to 7.6%. On average, an active PubChem compound was found to interact with ∼2.5 targets. Moreover, if only activities consistently detected in all assays available for a given target were considered, this ratio was further reduced to ∼2.3 targets per compound. For comparison, we have also analyzed high-confidence activity data from ChEMBL, the major public repository of compounds from medicinal chemistry, and determined that an active ChEMBL compound interacted on average with only ∼1.5 targets. Taken together, our results indicate that the degree of compound promiscuity is lower than often assumed.
Published: 2013
Full Text: View/download PDF
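
The summary statistics quoted in the abstract of result 32 (mean number of targets per active compound and the share of compounds active against two or more targets) can be computed from a list of compound and target pairs. The sketch below uses invented identifiers and is not the authors' analysis code.

```python
from collections import defaultdict


def promiscuity_stats(activities):
    """activities: iterable of (compound_id, target_id) pairs for active compounds.

    Returns (mean targets per compound, fraction of compounds with >= 2 targets).
    """
    targets = defaultdict(set)
    for compound, target in activities:
        targets[compound].add(target)  # a set ignores repeated assay hits
    counts = [len(t) for t in targets.values()]
    mean_targets = sum(counts) / len(counts)
    frac_multi = sum(c >= 2 for c in counts) / len(counts)
    return mean_targets, frac_multi


# Hypothetical activity records; c1/t1 appears twice on purpose.
activities = [
    ("c1", "t1"), ("c1", "t2"), ("c1", "t1"),
    ("c2", "t1"),
    ("c3", "t1"), ("c3", "t2"), ("c3", "t3"),
]
mean_targets, frac_multi = promiscuity_stats(activities)
print(mean_targets, frac_multi)
```

Counting distinct targets rather than assays matters here, since the abstract notes that several assays can be available for the same target.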


