1. CYPstrate: A Set of Machine Learning Models for the Accurate Classification of Cytochrome P450 Enzyme Substrates and Non-Substrates
- Author
- Christina de Bruyn Kops, Johannes Kirchmair, Conrad Stork, and Malte Holmer
- Subjects
cytochrome P450 enzymes, Pharmaceutical Science, Machine learning, computer.software_genre, 01 natural sciences, Isozyme, Article, Substrate Specificity, Xenobiotics, Analytical Chemistry, Set (abstract data type), 03 medical and health sciences, chemistry.chemical_compound, QD241-441, Cytochrome P-450 Enzyme System, Drug Discovery, Animals, Humans, Physical and Theoretical Chemistry, CYP2C8, substrates, 030304 developmental biology, chemistry.chemical_classification, 0303 health sciences, CYP3A4, biology, Drug discovery, business.industry, Organic Chemistry, Cytochrome P450, 0104 chemical sciences, 010404 medicinal & biomolecular chemistry, Enzyme, machine learning, chemistry, drug metabolism prediction, classification, Chemistry (miscellaneous), biology.protein, Molecular Medicine, Artificial intelligence, business, Xenobiotic, computer
- Abstract
The interaction of small organic molecules such as drugs, agrochemicals, and cosmetics with cytochrome P450 enzymes (CYPs) can lead to substantial changes in the bioavailability of active substances and hence consequences with respect to pharmacological efficacy and toxicity. Therefore, efficient means of predicting the interactions of small organic molecules with CYPs are of high importance to a host of different industries. In this work, we present a new set of machine learning models for the classification of xenobiotics into substrates and non-substrates of nine human CYP isozymes: CYPs 1A2, 2A6, 2B6, 2C8, 2C9, 2C19, 2D6, 2E1, and 3A4. The models are trained on an extended, high-quality collection of known substrates and non-substrates and have been subjected to thorough validation. Our results show that the models yield competitive performance and are favorable for the detection of CYP substrates. In particular, a new consensus model reached high performance, with Matthews correlation coefficients (MCCs) between 0.45 (CYP2C8) and 0.85 (CYP3A4), although at the cost of coverage. The best models presented in this work are accessible free of charge via the “CYPstrate” module of the New E-Resource for Drug Discovery (NERDD).
- Published
- 2021