182 results on '"Green WH"'
Search Results
2. Unravelling the role of cyclopentadiene and hexadiene in the formation of aromatics
- Author
-
Vermeire, Florence, Van Geem, Kevin, De Bruycker, Ruben, Herbinet, O, Carstensen, Hans-Heinrich, Battin-Leclerc, F, Merchant, SS, Green, WH, and Marin, Guy
- Published
- 2016
3. An experimental and theoretical study of cyclopentadiene-ethene co-pyrolysis: growth of polycyclic aromatic hydrocarbons
- Author
-
Djokic, Marko, Vandeputte, Aäron, Merchant, SS, Green, WH, Van Geem, Kevin, and Marin, Guy
- Subjects
- Technology and Engineering
- Published
- 2015
4. Superficial x-ray in the treatment of basal and squamous cell carcinomas: A viable option in select patients.
- Author
-
Cognetta AB, Howard BM, Heaton HP, Stoddard ER, Hong HG, and Green WH
- Published
- 2012
- Full Text
- View/download PDF
5. Blood Serotonin in Schizophrenic Children
- Author
-
Campbell M, Breuer H, Patrick Collins, Friedman E, Small AM, and Green WH
- Subjects
- medicine.medical_specialty, Psychosis, business.industry, Internal medicine, Statistical significance, medicine, In patient, General Medicine, Serotonin, Psychiatry, business, medicine.disease, Gastroenterology
- Abstract
Blood serotonin levels were measured in schizophrenic children, all of whom showed manifestations of illness in the first 2 years of life, and controls. Serotonin levels were higher in patients (mean = 0.267 μg/ml) than in controls (mean = 0.218 μg/ml), although the difference did not reach statistical significance. Serotonin levels were significantly higher in patients with florid psychosis and those with lower IQs than in patients in remission or partial remission or higher IQs.
- Published
- 1975
- Full Text
- View/download PDF
6. Integrating Machine Learning and Large Language Models to Advance Exploration of Electrochemical Reactions.
- Author
-
Zheng Z, Florit F, Jin B, Wu H, Li SC, Nandiwale KY, Salazar CA, Mustakis JG, Green WH, and Jensen KF
- Abstract
Electrochemical C-H oxidation reactions offer a sustainable route to functionalize hydrocarbons, yet identifying suitable substrates and optimizing synthesis remain challenging. Here, we report an integrated approach combining machine learning and large language models to streamline the exploration of electrochemical C-H oxidation reactions. Utilizing a batch rapid screening electrochemical platform, we evaluated a wide range of reactions, initially classifying substrates by their reactivity, while LLMs text-mined literature data to augment the training set. The resulting ML models for reactivity prediction achieved high accuracy (>90%) and enabled virtual screening of a large set of commercially available molecules. To optimize reaction conditions for selected substrates, LLMs were prompted to generate code that iteratively improved yields. This human-AI collaboration proved effective, efficiently identifying high-yield conditions for 8 drug-like substances or intermediates. Notably, we benchmarked the accuracy and reliability of 12 different LLMs (including LLaMA series, Claude series, OpenAI o1, and GPT-4) on code generation and function calling related to ML based on natural language prompts given by chemists to showcase potentials for accelerating research across four diverse tasks. In addition, we collected an experimental benchmark dataset comprising 1071 reaction conditions and yields for electrochemical C-H oxidation reactions. (© 2024 The Author(s). Angewandte Chemie International Edition published by Wiley-VCH GmbH.)
- Published
- 2025
- Full Text
- View/download PDF
7. Widespread Misinterpretation of pKa Terminology for Zwitterionic Compounds and Its Consequences.
- Author
-
Zheng JW, Leito I, and Green WH
- Subjects
- Hydrogen-Ion Concentration, Databases, Chemical, Acids chemistry, Terminology as Topic
- Abstract
The acid dissociation constant (pKa), which quantifies the propensity for a solute to donate a proton to its solvent, is crucial for drug design and synthesis, environmental fate studies, chemical manufacturing, and many other fields. Unfortunately, the terminology used for describing acid-base phenomena is sometimes inconsistent, causing large potential for misinterpretation. In this work, we examine a systematic confusion underlying the definition of "acidic" and "basic" pKa values for zwitterionic compounds. Due to this confusion, some pKa data are misrepresented in data repositories, including the widely used and highly trusted ChEMBL database. Such datasets are frequently used to supply training data for pKa prediction models, and hence, confusion and errors in the data make the model performance worse. Herein, we discuss the intricacies of this issue. We make suggestions for describing acid-base phenomena, training pKa prediction models, and stewarding pKa datasets, given the high potential for confusion and potentially high impact in downstream applications.
- Published
- 2024
- Full Text
- View/download PDF
8. Accurately Predicting Barrier Heights for Radical Reactions in Solution Using Deep Graph Networks.
- Author
-
Spiekermann KA, Dong X, Menon A, Green WH, Pfeifle M, Sandfort F, Welz O, and Bergeler M
- Abstract
Quantitative estimates of reaction barriers and solvent effects are essential for developing kinetic mechanisms and predicting reaction outcomes. Here, we create a new data set of 5,600 unique elementary radical reactions calculated using the M06-2X/def2-QZVP//B3LYP-D3(BJ)/def2-TZVP level of theory. A conformer search is done for each species using TPSS/def2-TZVP. Gibbs free energies of activation and of reaction for these radical reactions in 40 common solvents are obtained using COSMO-RS for solvation effects. These balanced reactions involve the elements H, C, N, O, and S, contain up to 19 heavy atoms, and have atom-mapped SMILES. All transition states are verified by an intrinsic reaction coordinate calculation. We next train a deep graph network to directly estimate the Gibbs free energy of activation and of reaction in both gas and solution phases using only the atom-mapped SMILES of the reactant and product and the SMILES of the solvent. This simple input representation avoids computationally expensive optimizations for the reactant, transition state, and product structures during inference, making our model well-suited for high-throughput predictive chemistry and quickly providing information for (retro-)synthesis planning tools. To properly measure model performance, we report results on both interpolative and extrapolative data splits and also compare to several baseline models. During training and testing, the data set is augmented by including the reverse direction of each reaction and variants with different resonance structures. After data augmentation, we have around 2 million entries to train the model, which achieves a testing set mean absolute error of 1.16 kcal mol⁻¹ for the Gibbs free energy of activation in solution. We anticipate this model will accelerate predictions for high-throughput screening to quickly identify relevant reactions in solution, and our data set will serve as a benchmark for future studies.
- Published
- 2024
- Full Text
- View/download PDF
9. When Do Quantum Mechanical Descriptors Help Graph Neural Networks to Predict Chemical Properties?
- Author
-
Li SC, Wu H, Menon A, Spiekermann KA, Li YP, and Green WH
- Abstract
Deep graph neural networks are extensively utilized to predict chemical reactivity and molecular properties. However, because of the complexity of chemical space, such models often have difficulty extrapolating beyond the chemistry contained in the training set. Augmenting the model with quantum mechanical (QM) descriptors is anticipated to improve its generalizability. However, obtaining QM descriptors often requires CPU-intensive computational chemistry calculations. To identify when QM descriptors help graph neural networks predict chemical properties, we conduct a systematic investigation of the impact of atom, bond, and molecular QM descriptors on the performance of directed message passing neural networks (D-MPNNs) for predicting 16 molecular properties. The analysis surveys computational and experimental targets, as well as classification and regression tasks, and varied data set sizes from several hundred to hundreds of thousands of data points. Our results indicate that QM descriptors are mostly beneficial for D-MPNN performance on small data sets, provided that the descriptors correlate well with the targets and can be readily computed with high accuracy. Otherwise, using QM descriptors can add cost without benefit or even introduce unwanted noise that can degrade model performance. Strategic integration of QM descriptors with D-MPNN unlocks potential for physics-informed, data-efficient modeling with some interpretability that can streamline de novo drug and material designs. To facilitate the use of QM descriptors in machine learning workflows for chemistry, we provide a set of guidelines regarding when and how to best leverage QM descriptors, a high-throughput workflow to compute them, and an enhancement to Chemprop, a widely adopted open-source D-MPNN implementation for chemical property prediction.
- Published
- 2024
- Full Text
- View/download PDF
10. Toward Accurate Quantum Mechanical Thermochemistry: (1) Extensible Implementation and Comparison of Bond Additivity Corrections and Isodesmic Reactions.
- Author
-
Wu H, Payne AM, Pang HW, Menon A, Grambow CA, Ranasinghe DS, Dong X, Grinberg Dana A, and Green WH
- Abstract
Obtaining accurate enthalpies of formation of chemical species, ΔHf, often requires empirical corrections that connect the results of quantum mechanical (QM) calculations with the experimental enthalpies of elements in their standard state. One approach is to use atomization energy corrections followed by bond additivity corrections (BACs), such as those defined by Petersson et al. or Anantharaman and Melius. Another approach is to utilize isodesmic reactions (IDRs) as shown by Buerger et al. We implement both approaches in Arkane, an open-source software that can calculate species thermochemistry using results from various QM software packages. In this work, we collect 421 reference species from the literature to derive ΔHf corrections and fit atomization energy corrections and BACs for 15 commonly used model chemistries. We find that both types of BACs yield similar accuracy, although Anantharaman- and Melius-type BACs appear to generalize better. Furthermore, BACs tend to achieve better accuracy than IDRs for commonly used model chemistries, and IDRs can be less robust because of the sensitivity to the chosen reference species and reactions. Overall, Anantharaman- and Melius-type BACs are our recommended approach for achieving accurate QM corrections for enthalpies.
- Published
- 2024
- Full Text
- View/download PDF
11. Superficial X-ray in the treatment of nonaggressive basal and squamous cell carcinoma in the elderly: A 22-year retrospective analysis.
- Author
-
Mattia A, Thompson A, Lee SK, Hong HG, Green WH, and Cognetta AB Jr
- Subjects
- Humans, Aged, Retrospective Studies, X-Rays, Radiography, Carcinoma, Squamous Cell diagnostic imaging, Skin Neoplasms diagnostic imaging, Skin Neoplasms pathology, Carcinoma, Basal Cell diagnostic imaging, Carcinoma, Basal Cell surgery
- Abstract
Competing Interests: Conflicts of interest None disclosed.
- Published
- 2024
- Full Text
- View/download PDF
12. Subgraph Isomorphic Decision Tree to Predict Radical Thermochemistry with Bounded Uncertainty Estimation.
- Author
-
Pang HW, Dong X, Johnson MS, and Green WH
- Abstract
Detailed chemical kinetic models offer valuable mechanistic insights into industrial applications. Automatic generation of reliable kinetic models requires fast and accurate radical thermochemistry estimation. Kineticists often prefer hydrogen bond increment (HBI) corrections from a closed-shell molecule to the corresponding radical for their interpretability, physical meaning, and facilitation of error cancellation as a relative quantity. Tree estimators, used due to limited data, currently rely on expert knowledge and manual construction, posing challenges in maintenance and improvement. In this work, we extend the subgraph isomorphic decision tree (SIDT) algorithm originally developed for rate estimation to estimate HBI corrections. We introduce a physics-aware splitting criterion, explore a bounded weighted uncertainty estimation method, and evaluate aleatoric uncertainty-based and model variance reduction-based prepruning methods. Moreover, we compile a data set of thermochemical parameters for 2210 radicals involving C, O, N, and H based on quantum chemical calculations from recently published works. We leverage the collected data set to train the SIDT model. Compared to existing empirical tree estimators, the SIDT model (1) offers an automatic approach to generating and extending the tree estimator for thermochemistry, (2) has better accuracy and R², (3) provides significantly more realistic uncertainty estimates, and (4) has a tree structure much more advantageous in descent speed. Overall, the SIDT estimator marks a great leap in kinetic modeling, offering more precise, reliable, and scalable predictions for radical thermochemistry.
- Published
- 2024
- Full Text
- View/download PDF
13. Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates.
- Author
-
Chung Y and Green WH
- Abstract
Fast and accurate prediction of solvent effects on reaction rates are crucial for kinetic modeling, chemical process design, and high-throughput solvent screening. Despite the recent advance in machine learning, a scarcity of reliable data has hindered the development of predictive models that are generalizable for diverse reactions and solvents. In this work, we generate a large set of data with the COSMO-RS method for over 28 000 neutral reactions and 295 solvents and train a machine learning model to predict the solvation free energy and solvation enthalpy of activation (ΔΔG‡solv, ΔΔH‡solv) for a solution phase reaction. On unseen reactions, the model achieves mean absolute errors of 0.71 and 1.03 kcal mol⁻¹ for ΔΔG‡solv and ΔΔH‡solv, respectively, relative to the COSMO-RS calculations. The model also provides reliable predictions of relative rate constants within a factor of 4 when tested on experimental data. The presented model can provide nearly instantaneous predictions of kinetic solvent effects or relative rate constants for a broad range of neutral closed-shell or free radical reactions and solvents only based on atom-mapped reaction SMILES and solvent SMILES strings. Competing Interests: There are no conflicts to declare. (This journal is © The Royal Society of Chemistry.)
- Published
- 2024
- Full Text
- View/download PDF
14. Chemprop: A Machine Learning Package for Chemical Property Prediction.
- Author
-
Heid E, Greenman KP, Chung Y, Li SC, Graff DE, Vermeire FH, Wu H, Green WH, and McGill CJ
- Subjects
- Neural Networks, Computer, Chemical Phenomena, Water, Machine Learning, Software
- Abstract
Deep learning has become a powerful and frequently employed tool for the prediction of molecular properties, thus creating a need for open-source and versatile software solutions that can be operated by nonexperts. Among the current approaches, directed message-passing neural networks (D-MPNNs) have proven to perform well on a variety of property prediction tasks. The software package Chemprop implements the D-MPNN architecture and offers simple, easy, and fast access to machine-learned molecular properties. Compared to its initial version, we present a multitude of new Chemprop functionalities such as the support of multimolecule properties, reactions, atom/bond-level properties, and spectra. Further, we incorporate various uncertainty quantification and calibration methods along with related metrics as well as pretraining and transfer learning workflows, improved hyperparameter optimization, and other customization options concerning loss functions or atom/bond features. We benchmark D-MPNN models trained using Chemprop with the new reaction, atom-level, and spectra functionality on a variety of property prediction data sets, including MoleculeNet and SAMPL, and observe state-of-the-art performance on the prediction of water-octanol partition coefficients, reaction barrier heights, atomic partial charges, and absorption spectra. Chemprop enables out-of-the-box training of D-MPNN models for a variety of problem settings in fast, user-friendly, and open-source software.
- Published
- 2024
- Full Text
- View/download PDF
15. Autonomous, multiproperty-driven molecular discovery: From predictions to measurements and back.
- Author
-
Koscher BA, Canty RB, McDonald MA, Greenman KP, McGill CJ, Bilodeau CL, Jin W, Wu H, Vermeire FH, Jin B, Hart T, Kulesza T, Li SC, Jaakkola TS, Barzilay R, Gómez-Bombarelli R, Green WH, and Jensen KF
- Abstract
A closed-loop, autonomous molecular discovery platform driven by integrated machine learning tools was developed to accelerate the design of molecules with desired properties. We demonstrated two case studies on dye-like molecules, targeting absorption wavelength, lipophilicity, and photooxidative stability. In the first study, the platform experimentally realized 294 unreported molecules across three automatic iterations of molecular design-make-test-analyze cycles while exploring the structure-function space of four rarely reported scaffolds. In each iteration, the property prediction models that guided exploration learned the structure-property space of diverse scaffold derivatives, which were realized with multistep syntheses and a variety of reactions. The second study exploited property models trained on the explored chemical space and previously reported molecules to discover nine top-performing molecules within a lightly explored structure-property space.
- Published
- 2023
- Full Text
- View/download PDF
16. Experimental Compilation and Computation of Hydration Free Energies for Ionic Solutes.
- Author
-
Zheng JW and Green WH
- Abstract
Although charged solutes are common in many chemical systems, traditional solvation models perform poorly in calculating solvation energies of ions. One major obstacle is the scarcity of experimental data for solvated ions. In this study, we release an experiment-based aqueous ionic solvation energy data set, IonSolv-Aq, that contains hydration free energies for 118 anions and 155 cations, more than 2 times larger than the set of hydration free energies for singly charged ions contained in the 2012 Minnesota Solvation Database commonly used in benchmarking studies. We discuss sources of systematic uncertainty in the data set and use the data to examine the accuracy of popular implicit solvation models COSMO-RS and SMD for predicting solvation free energies of singly charged ionic solutes in water. Our results indicate that most SMD and COSMO-RS modeling errors for ionic solutes are systematic and correctable with empirical parameters. We discuss two systematic offsets: one across all ions and one that depends on the functional group of the ionization site. After correcting for these offsets, solvation energies of singly charged ions are predicted using COSMO-RS to 3.1 kcal mol⁻¹ MAE against a challenging test set and 1.7 kcal mol⁻¹ MAE (about 3% relative error) with a filtered test set. The performance of SMD is similar, with MAE against those same test sets of 2.7 and 1.7 kcal mol⁻¹. These results underscore the importance of compiling larger experimental data sets to improve solvation model parametrization and fairly assess performance.
- Published
- 2023
- Full Text
- View/download PDF
17. ConfSolv: Prediction of Solute Conformer-Free Energies across a Range of Solvents.
- Author
-
Pattanaik L, Menon A, Settels V, Spiekermann KA, Tan Z, Vermeire FH, Sandfort F, Eiden P, and Green WH
- Abstract
Predicting Gibbs free energy of solution is key to understanding the solvent effects on thermodynamics and reaction rates for kinetic modeling. Accurately computing solution free energies requires the enumeration and evaluation of relevant solute conformers in solution. However, even after generation of relevant conformers, determining their free energy of solution requires an expensive workflow consisting of several ab initio computational chemistry calculations. To help address this challenge, we generate a large data set of solution free energies for nearly 44,000 solutes with almost 9 million conformers calculated in 41 different solvents using density functional theory and COSMO-RS and quantify the impact of solute conformers on the solution free energy. We then train a message passing neural network to predict the relative solution free energies of a set of solute conformers, enabling the identification of a small subset of thermodynamically relevant conformers. The model offers substantial computational time savings with predictions usually substantially within 1 kcal/mol of the free energy of the solution calculated by using computational chemical methods.
- Published
- 2023
- Full Text
- View/download PDF
18. EnzymeMap: curation, validation and data-driven prediction of enzymatic reactions.
- Author
-
Heid E, Probst D, Green WH, and Madsen GKH
- Abstract
Enzymatic reactions are an ecofriendly, selective, and versatile addition, sometimes even alternative to organic reactions for the synthesis of chemical compounds such as pharmaceuticals or fine chemicals. To identify suitable reactions, computational models to predict the activity of enzymes on non-native substrates, to perform retrosynthetic pathway searches, or to predict the outcomes of reactions including regio- and stereoselectivity are becoming increasingly important. However, current approaches are substantially hindered by the limited amount of available data, especially if balanced and atom mapped reactions are needed and if the models feature machine learning components. We therefore constructed a high-quality dataset (EnzymeMap) by developing a large set of correction and validation algorithms for recorded reactions in the literature and showcase its significant positive impact on machine learning models of retrosynthesis, forward prediction, and regioselectivity prediction, outperforming previous approaches by a large margin. Our dataset allows for deep learning models of enzymatic reactions with unprecedented accuracy, and is freely available online. Competing Interests: There are no conflicts to declare. (This journal is © The Royal Society of Chemistry.)
- Published
- 2023
- Full Text
- View/download PDF
19. On the accuracy of the chemically significant eigenvalue method.
- Author
-
Holtorf F and Green WH
- Abstract
We study the accuracy and convergence properties of the chemically significant eigenvalues method as proposed by Georgievskii et al. [J. Phys. Chem. A 117, 12146-12154 (2013)] and its close relative, dominant subspace truncation, for reduction of the energy-grained master equation. We formally derive the connection between both reduction techniques and provide hard error bounds for the accuracy of the latter which confirm the empirically excellent accuracy and convergence properties but also unveil practically relevant cases in which both methods are bound to fall short. We propose the use of balanced truncation as an effective alternative in these cases. (© 2023 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).)
- Published
- 2023
- Full Text
- View/download PDF
20. Predicting Critical Properties and Acentric Factors of Fluids Using Multitask Machine Learning.
- Author
-
Biswas S, Chung Y, Ramirez J, Wu H, and Green WH
- Subjects
- Temperature, Transition Temperature, Machine Learning, Neural Networks, Computer
- Abstract
Knowledge of critical properties, such as critical temperature, pressure, density, as well as acentric factor, is essential to calculate thermo-physical properties of chemical compounds. Experiments to determine critical properties and acentric factors are expensive and time intensive; therefore, we developed a machine learning (ML) model that can predict these molecular properties given the SMILES representation of a chemical species. We explored directed message passing neural network (D-MPNN) and graph attention network as ML architecture choices. Additionally, we investigated featurization with additional atomic and molecular features, multitask training, and pretraining using estimated data to optimize model performance. Our final model utilizes a D-MPNN layer to learn the molecular representation and is supplemented by Abraham parameters. A multitask training scheme was used to train a single model to predict all the critical properties and acentric factors along with boiling point, melting point, enthalpy of vaporization, and enthalpy of fusion. The model was evaluated on both random and scaffold splits where it shows state-of-the-art accuracies. The extensive data set of critical properties and acentric factors contains 1144 chemical compounds and is made available in the public domain together with the source code that can be used for further exploration.
- Published
- 2023
- Full Text
- View/download PDF
21. Computing Kinetic Solvent Effects and Liquid Phase Rate Constants Using Quantum Chemistry and COSMO-RS Methods.
- Author
-
Chung Y and Green WH
- Abstract
Many industrially and environmentally relevant reactions occur in the liquid phase. An accurate prediction of the rate constants is needed to analyze the intricate kinetic mechanisms of condensed phase systems. Quantum chemistry and continuum solvation models are commonly used to compute liquid phase rate constants; yet, their exact computational errors remain largely unknown, and a consistent computational workflow has not been well established. In this study, the accuracies of various quantum chemical and COSMO-RS levels of theory are assessed for the predictions of liquid phase rate constants and kinetic solvent effects. The prediction is made by first obtaining gas phase rate constants and subsequently applying solvation corrections. The calculation errors are evaluated using the experimental data of 191 rate constants that comprise 15 neutral closed-shell or free radical reactions and 49 solvents. The ωB97XD/def2-TZVP level of theory combined with the COSMO-RS method at the BP-TZVP level is shown to achieve the best performance with a mean absolute error of 0.90 in log₁₀(k_liq). Relative rate constants are additionally compared to determine the errors associated with the solvation calculations alone. Very accurate predictions of relative rate constants are achieved at nearly all levels of theory with a mean absolute error of 0.27 in log₁₀(k_solvent1/k_solvent2).
- Published
- 2023
- Full Text
- View/download PDF
22. Characterizing Uncertainty in Machine Learning for Chemistry.
- Author
-
Heid E, McGill CJ, Vermeire FH, and Green WH
- Subjects
- Uncertainty, Reproducibility of Results, Machine Learning
- Abstract
Characterizing uncertainty in machine learning models has recently gained interest in the context of machine learning reliability, robustness, safety, and active learning. Here, we separate the total uncertainty into contributions from noise in the data (aleatoric) and shortcomings of the model (epistemic), further dividing epistemic uncertainty into model bias and variance contributions. We systematically address the influence of noise, model bias, and model variance in the context of chemical property predictions, where the diverse nature of target properties and the vast chemical space give rise to many different distinct sources of prediction error. We demonstrate that different sources of error can each be significant in different contexts and must be individually addressed during model development. Through controlled experiments on data sets of molecular properties, we show important trends in model performance associated with the level of noise in the data set, size of the data set, model architecture, molecule representation, ensemble size, and data set splitting. In particular, we show that 1) noise in the test set can limit a model's observed performance when the actual performance is much better, 2) using size-extensive model aggregation structures is crucial for extensive property prediction, and 3) ensembling is a reliable tool for uncertainty quantification and improvement specifically for the contribution of model variance. We develop general guidelines on how to improve an underperforming model when falling into different uncertainty contexts.
- Published
- 2023
- Full Text
- View/download PDF
23. Well water induced lichen planus: a case report.
- Author
-
Rodriguez GM, Getz AJ, Schaffer A, and Green WH
- Subjects
- Humans, Lichen Planus diagnosis, Drinking Water adverse effects
- Published
- 2023
- Full Text
- View/download PDF
24. Butyl Acetate Pyrolysis and Combustion Chemistry: Mechanism Generation and Shock Tube Experiments.
- Author
-
Dong X, Pio G, Arafin F, Laich A, Baker J, Ninnemann E, Vasu SS, and Green WH
- Abstract
The combustion and pyrolysis behaviors of light esters and fatty acid methyl esters have been widely studied due to their relevance as biofuel and fuel additives. However, a knowledge gap exists for midsize alkyl acetates, especially ones with long alkoxyl groups. Butyl acetate, in particular, is a promising biofuel with its economic and robust production possibilities and ability to enhance blendstock performance and reduce soot formation. However, it is little studied from both experimental and modeling aspects. This work created detailed oxidation mechanisms for the four butyl acetate isomers (normal-, sec-, tert-, and iso-butyl acetate) at temperatures varying from 650 to 2000 K and pressures up to 100 atm using the Reaction Mechanism Generator. About 60% of species in each model have thermochemical parameters from published data or in-house quantum calculations, including fuel molecules and intermediate combustion products. Kinetics of essential primary reactions, retro-ene and hydrogen atom abstraction by OH or HO₂, governing the fuel oxidation pathways, were also calculated quantum-mechanically. Simulation of the developed mechanisms indicates that the majority of the fuel will decompose into acetic acid and relevant butenes at elevated temperatures, making their ignition behaviors similar to butenes. The adaptability of the developed models to high-temperature pyrolysis systems was tested against newly collected high-pressure shock experiments; the simulated CO mole fraction time histories have a reasonable agreement with the laser measurement in the shock tube. This work reveals the high-temperature oxidation chemistry of butyl acetates and demonstrates the validity of predictive models for biofuel chemistry established on accurate thermochemical and kinetic parameters.
- Published
- 2023
- Full Text
- View/download PDF
25. Endocrine mucin-producing sweat gland carcinoma with regional metastases in an African American female.
- Author
-
Mattia A, Thompson A, Green WH, and Cognetta AB Jr
- Abstract
Competing Interests: None disclosed.
- Published
- 2023
- Full Text
- View/download PDF
26. Rapidly Growing Nodule Within a Previously Radiated Area of the Scalp.
- Author
-
Thompson A, Mattia A, Dolson D, Schaffer A, and Green WH
- Subjects
- Humans, Scalp, Skin Neoplasms diagnosis, Skin Neoplasms radiotherapy
- Published
- 2023
- Full Text
- View/download PDF
27. A 10-year follow-up on the chemopreventive role of photodynamic therapy in a Gorlin syndrome patient.
- Author
-
Thompson A, Mattia A, Green WH, and Cognetta AB Jr
- Subjects
- Humans, Follow-Up Studies, Aminolevulinic Acid therapeutic use, Photosensitizing Agents therapeutic use, Basal Cell Nevus Syndrome drug therapy, Photochemotherapy, Carcinoma, Basal Cell drug therapy, Skin Neoplasms drug therapy, Skin Neoplasms prevention & control
- Published
- 2022
- Full Text
- View/download PDF
28. RMG Database for Chemical Property Prediction.
- Author
-
Johnson MS, Dong X, Grinberg Dana A, Chung Y, Farina D Jr, Gillis RJ, Liu M, Yee NW, Blondal K, Mazeau E, Grambow CA, Payne AM, Spiekermann KA, Pang HW, Goldsmith CF, West RH, and Green WH
- Subjects
- Kinetics, Thermodynamics, Databases, Factual, Solvents, Models, Chemical
- Abstract
The Reaction Mechanism Generator (RMG) database for chemical property prediction is presented. The RMG database consists of curated datasets and estimators for accurately predicting the parameters necessary for constructing a wide variety of chemical kinetic mechanisms. These datasets and estimators are mostly published and enable prediction of thermodynamics, kinetics, solvation effects, and transport properties. For thermochemistry prediction, the RMG database contains 45 libraries of thermochemical parameters with a combination of 4564 entries and a group additivity scheme with 9 types of corrections including radical, polycyclic, and surface absorption corrections with 1580 total curated groups and parameters for a graph convolutional neural network trained using transfer learning from a set of >130 000 DFT calculations to 10 000 high-quality values. Correction schemes for solvent-solute effects, important for thermochemistry in the liquid phase, are available. They include tabulated values for 195 pure solvents and 152 common solutes and a group additivity scheme for predicting the properties of arbitrary solutes. For kinetics estimation, the database contains 92 libraries of kinetic parameters containing a combined 21 000 reactions and contains rate rule schemes for 87 reaction classes trained on 8655 curated training reactions. Additional libraries and estimators are available for transport properties. All of this information is easily accessible through the graphical user interface at https://rmg.mit.edu. Bulk or on-the-fly use can be facilitated by interfacing directly with the RMG Python package which can be installed from Anaconda. The RMG database provides kineticists with easy access to estimates of the many parameters they need to model and analyze kinetic systems. 
This helps to speed up and facilitate kinetic analysis by enabling easy hypothesis testing on pathways, by providing parameters for model construction, and by providing checks on kinetic parameters from other sources.
- Published
- 2022
- Full Text
- View/download PDF
29. Examining the accuracy of methods for obtaining pressure dependent rate coefficients.
- Author
-
Johnson MS and Green WH
- Subjects
- Kinetics, Computer Simulation, Least-Squares Analysis, Models, Theoretical
- Abstract
The full energy-grained master equation (ME) is too large to be conveniently used in kinetic modeling, so almost always it is replaced by a reduced model using phenomenological rate coefficients. The accuracy of several methods for obtaining these pressure-dependent phenomenological rate coefficients, and so for constructing a reduced model, is tested against direct numerical solutions of the full ME, and the deviations are sometimes quite large. An algebraic expression for the error between the popular chemically-significant eigenvalue (CSE) method and the exact ME solution is derived. An alternative way to compute phenomenological rate coefficients, simulation least-squares (SLS), is presented. SLS is often about as accurate as CSE, and sometimes has significant advantages over CSE. One particular variant of SLS, using the matrix exponential, is as fast as CSE, and seems to be more robust. However, all of the existing methods for constructing reduced models to approximate the ME, including CSE and SLS, are inaccurate under some conditions, and sometimes they fail dramatically due to numerical problems. The challenge of constructing useful reduced models that more reliably emulate the full ME solution is discussed.
- Published
- 2022
- Full Text
- View/download PDF
30. The reaction step: general discussion.
- Author
-
Burke MP, Casavecchia P, Cavallotti C, Clary DC, Doner A, Green WH, Grinberg Dana A, Guo H, Heathcote D, Hochlaf M, Klippenstein SJ, Kuwata KT, Lawrence JE, Lourderaj U, Mebel AM, Milesevic D, Mullin AS, Nguyen TL, Olzmann M, Orr-Ewing AJ, Osborn DL, Pazdera TM, Robertson PA, Robinson MS, Rotavera B, Seakins PW, Shannon RJ, Shiels OJ, Suits AG, Trevitt AJ, Troe J, Vallance C, Welz O, Zhang F, and Zádor J
- Published
- 2022
- Full Text
- View/download PDF
31. Impact of Lindemann and related theories: general discussion.
- Author
-
Bodi A, Burke MP, Butler AA, Douglas K, Eskola AJ, Green WH, Guo H, Heard DE, Heathcote D, Hochlaf M, Klippenstein SJ, Kuwata KT, Lawrence JE, Lester MI, Lourderaj U, Mebel AM, Milesevic D, Mullin AS, Nguyen TL, Olzmann M, Orr-Ewing AJ, Osborn DL, Pazdera TM, Pfeifle M, Plane JMC, Pun R, Robertson PA, Robinson MS, Seakins PW, Shannon RJ, Taatjes CA, Troe J, Vallance C, Welz O, Zádor J, and Zhang F
- Published
- 2022
- Full Text
- View/download PDF
32. Collisional energy transfer: general discussion.
- Author
-
Babikov D, Burke MP, Casavecchia P, Green WH, Grinberg Dana A, Guo H, Heard DE, Heathcote D, Hochlaf M, Jasper AW, Klippenstein SJ, Lester MI, Martí C, Mebel AM, Mullin AS, Nguyen TL, Olzmann M, Orr-Ewing AJ, Osborn DL, Robertson PA, Robinson MS, Shannon RJ, Shiels OJ, Suits AG, Taatjes CA, Troe J, Xu X, You X, Zhang F, Zhang RM, and Zádor J
- Subjects
- Kinetics, Energy Transfer
- Published
- 2022
- Full Text
- View/download PDF
33. The master equation: general discussion.
- Author
-
Aerssens J, Burke MP, Cavallotti C, Green NJB, Green WH, Guo H, Heard D, Hochlaf M, Jasper AW, Klippenstein SJ, Kuwata KT, Lawrence JE, Mebel AM, Mullin AS, Nguyen TL, Olzmann M, Osborn DL, Pfeifle M, Plane JMC, Robertson PA, Robertson SH, Salzburger M, Seakins PW, Shannon RJ, Shiels OJ, Trevitt AJ, Vallance C, Welz O, Xu X, Zádor J, and Zhang RM
- Published
- 2022
- Full Text
- View/download PDF
34. Concluding remarks: Faraday Discussion on unimolecular reactions.
- Author
-
Green WH
- Abstract
This Faraday Discussion, marking the centenary of Lindemann's explanation of the pressure-dependence of unimolecular reactions, presented recent advances in measuring and computing collisional energy transfer efficiencies, microcanonical rate coefficients, and pressure-dependent (phenomenological) rate coefficients, and the incorporation of these rate coefficients in kinetic models. Several of the presentations featured systems where breakdown of the Born-Oppenheimer approximation is key to understanding the measured rates/products. Many of the reaction systems presented were quite complex, which can make it difficult to go from "plausible proposed explanation" to "quantitative agreement between model and experiment". This complexity highlights the need for better automation of the calculations, better documentation and benchmarking to catch any errors and to make the calculations more easily reproducible, and continued (and even closer) cooperation of experimentalists and modelers. In some situations the correct definition of a "species" is debatable, since the population distributions and time evolution are so distorted from the perfect-Boltzmann Lewis-structure zero-order concept of a chemical species. Despite all these challenges, the field has made tremendous advances, and several cases were presented which demonstrated both excellent understanding of very complicated reaction chemistry and quantitatively accurate predictions of complicated experiments. Some of the interesting contributions to this Discussion are highlighted here, with some comments and suggestions for next steps.
- Published
- 2022
- Full Text
- View/download PDF
35. High accuracy barrier heights, enthalpies, and rate coefficients for chemical reactions.
- Author
-
Spiekermann K, Pattanaik L, and Green WH
- Abstract
Quantitative chemical reaction data, including activation energies and reaction rates, are crucial for developing detailed kinetic mechanisms and accurately predicting reaction outcomes. However, such data are often difficult to find, and high-quality datasets are especially rare. Here, we use CCSD(T)-F12a/cc-pVDZ-F12//ωB97X-D3/def2-TZVP to obtain high-quality single point calculations for nearly 22,000 unique stable species and transition states. We report the results from these quantum chemistry calculations and extract the barrier heights and reaction enthalpies to create a kinetics dataset of nearly 12,000 gas-phase reactions. These reactions involve H, C, N, and O, contain up to seven heavy atoms, and have cleaned atom-mapped SMILES. Our higher-accuracy coupled-cluster barrier heights differ significantly (RMSE of ∼5 kcal mol⁻¹) relative to those calculated at ωB97X-D3/def2-TZVP. We also report accurate transition state theory rate coefficients [Formula: see text] between 300 K and 2000 K and the corresponding Arrhenius parameters for a subset of rigid reactions. We believe this data will accelerate development of automated and reliable methods for quantitative reaction prediction. (© 2022. The Author(s).)
- Published
- 2022
- Full Text
- View/download PDF
36. Fast Predictions of Reaction Barrier Heights: Toward Coupled-Cluster Accuracy.
- Author
-
Spiekermann KA, Pattanaik L, and Green WH
- Subjects
- Kinetics, Thermodynamics
- Abstract
Quantitative estimates of reaction barriers are essential for developing kinetic mechanisms and predicting reaction outcomes. However, the lack of experimental data and the steep scaling of accurate quantum calculations often hinder the ability to obtain reliable kinetic values. Here, we train a directed message passing neural network on nearly 24,000 diverse gas-phase reactions calculated at CCSD(T)-F12a/cc-pVDZ-F12//ωB97X-D3/def2-TZVP. Our model uses 75% fewer parameters than previous studies, an improved reaction representation, and proper data splits to accurately estimate performance on unseen reactions. Using information from only the reactant and product, our model quickly predicts barrier heights with a testing MAE of 2.6 kcal mol⁻¹ relative to the coupled-cluster data, making it more accurate than a good density functional theory calculation. Furthermore, our results show that future modeling efforts to estimate reaction properties would significantly benefit from fine-tuning calibration using a transfer learning technique. We anticipate this model will accelerate and improve kinetic predictions for small molecule chemistry.
- Published
- 2022
- Full Text
- View/download PDF
37. Predicting Solubility Limits of Organic Solutes for a Wide Range of Solvents and Temperatures.
- Author
-
Vermeire FH, Chung Y, and Green WH
- Subjects
- Data Collection, Solubility, Solutions, Solvents chemistry, Temperature, Thermodynamics, Water chemistry
- Abstract
The solubility of organic molecules is crucial in organic synthesis and industrial chemistry; it is important in the design of many phase separation and purification units, and it controls the migration of many species into the environment. To decide which solvents and temperatures can be used in the design of new processes, trial and error is often used, as the choice is restricted by unknown solid solubility limits. Here, we present a fast and convenient computational method for estimating the solubility of solid neutral organic molecules in water and many organic solvents for a broad range of temperatures. The model is developed by combining fundamental thermodynamic equations with machine learning models for solvation free energy, solvation enthalpy, Abraham solute parameters, and aqueous solid solubility at 298 K. We provide free open-source and online tools for the prediction of solid solubility limits and a curated data collection (SolProp) that includes more than 5000 experimental solid solubility values for validation of the model. The model predictions are accurate for aqueous systems and for a huge range of organic solvents up to 550 K or higher. Methods to further improve solid solubility predictions by providing experimental data on the solute of interest in another solvent, or on the solute's sublimation enthalpy, are also presented.
- Published
- 2022
- Full Text
- View/download PDF
38. Machine Learning of Reaction Properties via Learned Representations of the Condensed Graph of Reaction.
- Author
-
Heid E and Green WH
- Subjects
- Cheminformatics, Machine Learning, Neural Networks, Computer
- Abstract
The estimation of chemical reaction properties such as activation energies, rates, or yields is a central topic of computational chemistry. In contrast to molecular properties, where machine learning approaches such as graph convolutional neural networks (GCNNs) have excelled for a wide variety of tasks, no general and transferable adaptations of GCNNs for reactions have been developed yet. We therefore combined a popular cheminformatics reaction representation, the so-called condensed graph of reaction (CGR), with a recent GCNN architecture to arrive at a versatile, robust, and compact deep learning model. The CGR is a superposition of the reactant and product graphs of a chemical reaction and thus an ideal input for graph-based machine learning approaches. The model learns to create a data-driven, task-dependent reaction embedding that does not rely on expert knowledge, similar to current molecular GCNNs. Our approach outperforms current state-of-the-art models in accuracy, is applicable even to imbalanced reactions, and possesses excellent predictive capabilities for diverse target properties, such as activation energies, reaction enthalpies, rate constants, yields, or reaction classes. We furthermore curated a large set of atom-mapped reactions along with their target properties, which can serve as benchmark data sets for future work. All data sets and the developed reaction GCNN model are available online, free of charge, and open source.
- Published
- 2022
- Full Text
- View/download PDF
39. Kinetic Modeling of API Oxidation: (2) Imipramine Stress Testing.
- Author
-
Wu H, Grinberg Dana A, Ranasinghe DS, Pickard FC 4th, Wood GPF, Zelesky T, Sluggett GW, Mustakis J, and Green WH
- Subjects
- Drug Stability, Free Radicals, Kinetics, Oxidation-Reduction, Imipramine, Models, Chemical
- Abstract
Gauging the chemical stability of active pharmaceutical ingredients (APIs) is critical at various stages of pharmaceutical development to identify potential risks from drug degradation and ensure the quality and safety of the drug product. Stress testing has been the major experimental method to study API stability, but this analytical approach is time-consuming, resource-intensive, and limited by API availability, especially during the early stages of drug development. Novel computational chemistry methods may assist in screening for API chemical stability prior to synthesis and augment contemporary API stress testing studies, with the potential to significantly accelerate drug development and reduce costs. In this work, we leverage quantum chemical calculations and automated reaction mechanism generation to provide new insights into API degradation studies. In the continuation of part one in this series of studies [Grinberg Dana et al., Mol. Pharm. 2021 18 (8), 3037-3049], we have generated the first ab initio predictive chemical kinetic model of free-radical oxidative degradation for API stress testing. We focused on imipramine oxidation in an azobis(isobutyronitrile) (AIBN)/H2O/CH3OH solution and compared the model's predictions with concurrent experimental observations. We analytically determined iminodibenzyl and desimipramine as imipramine's two major degradation products under industry-standard AIBN stress testing conditions, and our ab initio kinetic model successfully identified both of them in its prediction for the top three degradation products. This work shows the potential and utility of predictive chemical kinetic modeling and quantum chemical computations to elucidate API chemical stability issues. Further, we envision an automated digital workflow that integrates first-principle models with data-driven methods that, when actively and iteratively combined with high-throughput experiments, can substantially accelerate and transform future API chemical stability studies.
- Published
- 2022
- Full Text
- View/download PDF
40. Similarity based enzymatic retrosynthesis.
- Author
-
Sankaranarayanan K, Heid E, Coley CW, Verma D, Green WH, and Jensen KF
- Abstract
Enzymes synthesize complex natural products effortlessly by catalyzing chemo-, regio-, and enantio-selective transformations. Further, biocatalytic processes are increasingly replacing conventional organic synthesis steps because they use mild solvents, avoid the use of metals, and reduce overall non-biodegradable waste. Here, we present a single-step retrosynthesis search algorithm to facilitate enzymatic synthesis of natural product analogs. First, we develop a tool, RDEnzyme, capable of extracting and applying stereochemically consistent enzymatic reaction templates, i.e., subgraph patterns that describe the changes in connectivity between a product molecule and its corresponding reactant(s). Using RDEnzyme, we demonstrate that molecular similarity is an effective metric to propose retrosynthetic disconnections based on analogy to precedent enzymatic reactions in UniProt/RHEA. Using ∼5500 reactions from RHEA as a knowledge base, the recorded reactants to the product are among the top 10 proposed suggestions in 71% of ∼700 test reactions. Second, we trained a statistical model capable of discriminating between reaction pairs belonging to homologous enzymes and evolutionarily distant enzymes using ∼30 000 reaction pairs from SwissProt as a knowledge base. This model is capable of understanding patterns in enzyme promiscuity to evaluate the likelihood of experimental evolution success. By recursively applying the similarity-based single-step retrosynthesis and evolution prediction workflow, we successfully plan the enzymatic synthesis routes for both active pharmaceutical ingredients (e.g. Islatravir, Molnupiravir) and commodity chemicals (e.g. 1,4-butanediol, branched-chain higher alcohols/biofuels), in a retrospective fashion. 
Through the development and demonstration of the single-step enzymatic retrosynthesis strategy using natural transformations, our approach provides a first step towards solving the challenging problem of incorporating both enzyme- and organic-chemistry based transformations into a computer aided synthesis planning workflow. Competing Interests: There are no conflicts to declare. (This journal is © The Royal Society of Chemistry.)
- Published
- 2022
- Full Text
- View/download PDF
41. An Integrated Assessment of Emissions, Air Quality, and Public Health Impacts of China's Transition to Electric Vehicles.
- Author
-
Hsieh IL, Chossière GP, Gençer E, Chen H, Barrett S, and Green WH
- Abstract
Electric vehicles (EVs) are a promising pathway to providing cleaner personal mobility. China provides substantial supports to increase EV market share. This study provides an extensive analysis of the currently unclear environmental and health benefits of these incentives at the provincial level. EVs in China have modest cradle-to-gate CO2 benefits (on average 29%) compared to conventional internal combustion engine vehicles (ICEVs), but have similar carbon emissions relative to hybrid electric vehicles. Well-to-wheel air pollutant emissions assessment shows that emissions associated with ICEVs are mainly from gasoline production, not the tailpipe, suggesting tighter emissions controls on refineries are needed to combat air pollution problems effectively. By integrating a vehicle fleet model into policy scenario analysis, we quantify the policy impacts associated with the passenger vehicles in the major Chinese provinces: broader EV penetration, especially combined with cleaner power generation, could deliver greater air quality and health benefits, but not necessarily significant climate change mitigation. The total value to society of the climate and mortality benefits in 2030 is found to be comparable to a prior estimate of the EV policy's economic costs.
- Published
- 2022
- Full Text
- View/download PDF
42. Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy.
- Author
-
Chung Y, Vermeire FH, Wu H, Walker PJ, Abraham MH, and Green WH
- Subjects
- Entropy, Solutions, Solvents, Thermodynamics, Machine Learning, Neural Networks, Computer
- Abstract
We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain while the machine learning models adopt a directed message passing neural network. The solute parameters predicted from SoluteGC and SoluteML are used to calculate solvation energy and enthalpy via linear free energy relationships. Extensive data sets containing 8366 solute parameters, 20,253 solvation free energies, and 6322 solvation enthalpies are compiled in this work to train the models. The three models are each evaluated on the same test sets using both random and substructure-based solute splits for solvation energy and enthalpy predictions. The results show that the DirectML model is superior to the SoluteML and SoluteGC models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods. Yet, even though the DirectML model performs better in general, all three models are useful for various purposes. Uncertain predicted values can be identified by comparing the three models, and when the 3 models are combined together, they can provide even more accurate predictions than any one of them individually. Finally, we present our compiled solute parameter, solvation energy, and solvation enthalpy databases (SoluteDB, dGsolvDBx, dHsolvDB) and provide public access to our final prediction models through a simple web-based tool, software packages, and source code.
- Published
- 2022
- Full Text
- View/download PDF
43. Influence of Template Size, Canonicalization, and Exclusivity for Retrosynthesis and Reaction Prediction Applications.
- Author
-
Heid E, Liu J, Aude A, and Green WH
- Subjects
- Computers, Heuristics, Machine Learning, Algorithms, Software
- Abstract
Heuristic and machine learning models for rank-ordering reaction templates comprise an important basis for computer-aided organic synthesis regarding both product prediction and retrosynthetic pathway planning. Their viability relies heavily on the quality and characteristics of the underlying template database. With the advent of automated reaction and template extraction software and consequently the creation of template databases too large for manual curation, a data-driven approach to assess and improve the quality of template sets is needed. We therefore systematically studied the influence of template generality, canonicalization, and exclusivity on the performance of different template ranking models. We find that duplicate and nonexclusive templates, i.e., templates which describe the same chemical transformation on identical or overlapping sets of molecules, decrease both the accuracy of the ranking algorithm and the applicability of the respective top-ranked templates significantly. To remedy the negative effects of nonexclusivity, we developed a general and computationally efficient framework to deduplicate and hierarchically correct templates. As a result, performance improved considerably for both heuristic and machine learning template ranking models, as well as multistep retrosynthetic planning models. The canonicalization and correction code is made freely available.
- Published
- 2022
- Full Text
- View/download PDF
44. Multi-fidelity prediction of molecular optical peaks with deep learning.
- Author
-
Greenman KP, Green WH, and Gómez-Bombarelli R
- Abstract
Optical properties are central to molecular design for many applications, including solar cells and biomedical imaging. A variety of ab initio and statistical methods have been developed for their prediction, each with a trade-off between accuracy, generality, and cost. Existing theoretical methods such as time-dependent density functional theory (TD-DFT) are generalizable across chemical space because of their robust physics-based foundations but still exhibit random and systematic errors with respect to experiment despite their high computational cost. Statistical methods can achieve high accuracy at a lower cost, but data sparsity and unoptimized molecule and solvent representations often limit their ability to generalize. Here, we utilize directed message passing neural networks (D-MPNNs) to represent both dye molecules and solvents for predictions of molecular absorption peaks in solution. Additionally, we demonstrate a multi-fidelity approach based on an auxiliary model trained on over 28 000 TD-DFT calculations that further improves accuracy and generalizability, as shown through rigorous splitting strategies. Combining several openly-available experimental datasets, we benchmark these methods against a state-of-the-art regression tree algorithm and compare the D-MPNN solvent representation to several alternatives. Finally, we explore the interpretability of the learned representations using dimensionality reduction and evaluate the use of ensemble variance as an estimator of the epistemic uncertainty in our predictions of molecular peak absorption in solution. The prediction methods proposed herein can be integrated with active learning, generative modeling, and experimental workflows to enable the more rapid design of molecules with targeted optical properties. Competing Interests: There are no conflicts to declare. (This journal is © The Royal Society of Chemistry.)
- Published
- 2022
- Full Text
- View/download PDF
45. Chemistry of Simple Organic Peroxy Radicals under Atmospheric through Combustion Conditions: Role of Temperature, Pressure, and NOx Level.
- Author
-
Goldman MJ, Green WH, and Kroll JH
- Abstract
Organic peroxy radicals (RO2) are key intermediates in the oxidation of organic compounds in both combustion systems and the atmosphere. While many studies have focused on reactions of RO2 in specific applications, spanning a relatively limited range of reaction conditions, the generalized behavior of RO2 radicals across the full range of reaction conditions (temperatures, pressures, and NO levels) has, to our knowledge, never been explored. In this work, two simple model systems, n-propyl peroxy radical and γ-isobutanol peroxy radical, are used to evaluate RO2 fate using pressure-dependent kinetics. The fate of these radicals was modeled based on literature data over 250-1250 K, 0.01-100 bar, and 1 ppt to 100 ppm of NO, which spans the typical range of atmospheric and combustion conditions. Covering this entire range provides a broad overview of the reactivity of these species under both atmospheric and combustion conditions, as well as under conditions intermediate to the two. A particular focus is on the importance of reactions that were traditionally considered to occur in only one of the two sets of conditions: RO2 unimolecular isomerization reactions (long known to occur in combustion systems but only recently appreciated in atmospheric systems) and RO2 bimolecular reactions of RO2 with NO (thought to occur mainly in atmospheric systems and rarely considered in combustion chemistry).
- Published
- 2021
- Full Text
- View/download PDF
46. EHreact: Extended Hasse Diagrams for the Extraction and Scoring of Enzymatic Reaction Templates.
- Author
-
Heid E, Goldman S, Sankaranarayanan K, Coley CW, Flamm C, and Green WH
- Subjects
- Databases, Factual, Probability, Computers, Software
- Abstract
Data-driven computer-aided synthesis planning utilizing organic or biocatalyzed reactions from large databases has gained increasing interest in the last decade, sparking the development of numerous tools to extract, apply, and score general reaction templates. The generation of reaction rules for enzymatic reactions is especially challenging since substrate promiscuity varies between enzymes, causing the optimal levels of rule specificity and optimal number of included atoms to differ between enzymes. This complicates an automated extraction from databases and has promoted the creation of manually curated reaction rule sets. Here, we present EHreact, a purely data-driven open-source software tool, to extract and score reaction rules from sets of reactions known to be catalyzed by an enzyme at appropriate levels of specificity without expert knowledge. EHreact extracts and groups reaction rules into tree-like structures, Hasse diagrams, based on common substructures in the imaginary transition structures. Each diagram can be utilized to output a single or a set of reaction rules, as well as calculate the probability of a new substrate to be processed by the given enzyme by inferring information about the reactive site of the enzyme from the known reactions and their grouping in the template tree. EHreact heuristically predicts the activity of a given enzyme on a new substrate, outperforming current approaches in accuracy and functionality.
- Published
- 2021
- Full Text
- View/download PDF
47. Screening for New Pathways in Atmospheric Oxidation Chemistry with Automated Mechanism Generation.
- Author
-
Barber VP, Green WH, and Kroll JH
- Abstract
In the Earth's atmosphere, reactive organic carbon undergoes oxidation via a highly complex, multigeneration process, with implications for air quality and climate. Decades of experimental and theoretical studies, primarily on the reactions of hydrocarbons, have led to a canonical understanding of how gas-phase oxidation of organic compounds takes place. Recent research has brought to light a number of examples where the presence of certain functional groups opens up reaction pathways for key radical intermediates, including alkyl radicals, alkoxy radicals, and peroxy radicals, that are substantially different from traditional oxidation mechanisms. These discoveries highlight the need for methods that systematically explore the chemistry of complex, functionalized molecules without being prohibitively expensive. In this work, automated reaction network generation is used as a screening tool for new pathways in atmospheric oxidation chemistry. The reaction mechanism generator (RMG) is used to generate reaction networks for the OH-initiated oxidation of 200 mono- and bifunctionally substituted n-pentanes. The resulting networks are then filtered to highlight the reactions of key radical intermediates that are fast enough to compete with traditional atmospheric removal processes as well as "uncanonical" processes which differ from traditionally accepted oxidation mechanisms. Several recently reported, uncanonical atmospheric mechanisms appear in the RMG dataset. These "proof of concept" results provide confidence in this approach as a tool in the search for overlooked atmospheric oxidation chemistry. Several previously unreported reaction types are also encountered in the dataset. The most potentially atmospherically important of these is a radical-carbonyl ring-closure reaction that produces a highly functionalized cyclic alkoxy radical. This pathway is proposed as a promising target for further study via experiments and more detailed theoretical calculations. 
The approach presented herein represents a new way to efficiently explore atmospheric chemical space and unearth overlooked reaction steps in atmospheric oxidation.
- Published
- 2021
48. Kinetic Modeling of API Oxidation: (1) The AIBN/H2O/CH3OH Radical "Soup".
- Author
-
Grinberg Dana A, Wu H, Ranasinghe DS, Pickard FC 4th, Wood GPF, Zelesky T, Sluggett GW, Mustakis J, and Green WH
- Subjects
- Alcohols chemistry, Computer Simulation, Free Radicals chemistry, Hydrogen-Ion Concentration, Kinetics, Oxidation-Reduction, Reactive Oxygen Species chemistry, Software, Temperature, Methanol chemistry, Models, Chemical, Nitriles chemistry, Water chemistry
- Abstract
Stress testing of active pharmaceutical ingredients (API) is an important tool used to gauge chemical stability and identify potential degradation products. While different flavors of API stress testing systems have been used in experimental investigations for decades, the detailed kinetics of such systems as well as the chemical composition of prominent reactive species, specifically reactive oxygen species, are unknown. As a first step toward understanding and modeling API oxidation in stress testing, we investigated a typical radical "soup" solution an API is subject to during stress testing. Here we applied ab initio electronic structure calculations to automatically generate and refine a detailed chemical kinetics model, taking a fresh look at API oxidation. We generated a detailed kinetic model for a representative azobis(isobutyronitrile) (AIBN)/H2O/CH3OH stress-testing system with a varied cosolvent ratio (50%/50%-99.5%/0.5% vol water/methanol) for 5.0 mM AIBN and representative pH values of 4-10 at 40 °C that was stirred and open to the atmosphere. At acidic conditions, hydroxymethyl alkoxyl is the dominant alkoxyl radical, and at basic conditions, for most studied initial methanol concentrations, cyanoisopropyl alkoxyl becomes the dominant alkoxyl radical, albeit at an overall lower concentration. At acidic conditions, the levels of cyanoisopropyl peroxyl, hydroxymethyl peroxyl, and hydroperoxyl radicals are relatively high and comparable, while, at both neutral and basic pH conditions, superoxide becomes the prominent radical in the system. The present work reveals the prominent species in a common model API stress testing system at various cosolvent and pH conditions, sets the stage for an in-depth quantitative API kinetic study, and demonstrates the usage of novel software tools for automated chemical kinetic model generation and ab initio refinement.
- Published
- 2021
49. C14H10 polycyclic aromatic hydrocarbon formation by acetylene addition to naphthalenyl radicals observed.
- Author
-
Yang J, Smith MC, Prendergast MB, Chu TC, and Green WH
- Abstract
The formation of polycyclic aromatic hydrocarbons (PAHs) during combustion has a substantial impact on environmental pollution and public health. The hydrogen-abstraction-acetylene-addition (HACA) mechanism is expected to be a significant source of larger PAHs containing more than two rings. In this study, the reactions of 1-naphthalenyl and 2-naphthalenyl radicals with acetylene (C2H2) are investigated using VUV photoionization time-of-flight mass spectrometry at 500 to 800 K, 15 to 50 torr, and reaction times up to 10 ms. Our experimental conditions allow us to probe the Bittner-Howard and modified Frenklach HACA routes, but not routes that require multiple radicals to drive the chemistry. The kinetic measurements are compared to a temperature-dependent kinetic model constructed using quantum chemistry calculations and accounting for chemical-activation and fall-off effects. We measure significant quantities of C14H10 (likely phenanthrene and anthracene), as well as 2-ethynylnaphthalene (C12H8), from the reaction of the 2-naphthalenyl radical with C2H2; these results are consistent with the predictions of the kinetic model and the HACA mechanism, but contradict a previous experimental study that indicated no C14H10 formation in the 2-naphthalenyl + C2H2 reaction. In the 1-naphthalenyl radical + C2H2 reaction system, the primary product measured is C12H8, consistent with the predicted formation of acenaphthylene via HACA. The present work provides direct experimental evidence that single-radical HACA can be an important mechanism for the formation of PAHs larger than naphthalene, validating a common assumption in combustion models.
- Published
- 2021
50. Predicting Infrared Spectra with Message Passing Neural Networks.
- Author
-
McGill C, Forsuelo M, Guan Y, and Green WH
- Subjects
- Software, Machine Learning, Neural Networks, Computer
- Abstract
Infrared (IR) spectroscopy remains an important tool for chemical characterization and identification. Chemprop-IR has been developed as a software package for the prediction of IR spectra through the use of machine learning. This work serves the dual purpose of providing a trained general-purpose model for the prediction of IR spectra with ease and providing the Chemprop-IR software framework for the training of new models. In Chemprop-IR, molecules are encoded using a directed message passing neural network, allowing for molecule latent representations to be learned and optimized for the task of spectral predictions. Model training incorporates spectra metrics and normalization techniques that offer better performance with spectral predictions than standard practice in regression models. The model makes use of pretraining using quantum chemistry calculations and ensembling of multiple submodels to improve generalizability and performance. The spectral predictions that result are of high quality, showing capability to capture the extreme diversity of spectral forms over chemical space and represent complex peak structures.
- Published
- 2021