An automated PLS search for biologically relevant QSAR descriptors
- Source :
- Journal of Computer-Aided Molecular Design. 18:437-449
- Publication Year :
- 2004
- Publisher :
- Springer Science and Business Media LLC, 2004.
Abstract
- An automated PLS engine, WB-PLS, was applied to 1632 QSAR series with at least 25 compounds per series extracted from WOMBAT (WOrld of Molecular BioAcTivity). WB-PLS extracts a single Y variable per series, as well as pre-computed X variables from a table. The table contained 2D descriptors, the drug-like MDL 320 keys as implemented in the Mesa A&C Fingerprint module, and in-house generated topological-pharmacophore SMARTS counts and fingerprints. Each descriptor type was treated as a block, with or without scaling. Cross-validation, variable importance on projections (VIP) above 0.8 and q² ⩾ 0.3 were applied for model significance. Among cross-validation methods, leave-one-in-seven-out (CV7) is a better measure of model significance, compared to leave-one-out (measuring redundancy) and leave-half-out (too restrictive). SMARTS counts overlap with 2D descriptors (having a more quantitative nature), whereas MDL keys overlap with in-house fingerprints (both are more qualitative). SMARTS counts are the most effective descriptor system when compared to the other three. At the individual level, size-related descriptors and topological indices (in the 2D property space), and branched SMARTS, aromatic and ring atom types and halogens are found to be most relevant according to the VIP criterion.
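As a rough illustration of the significance criteria named in the abstract (seven-group cross-validation, q² ⩾ 0.3, VIP above 0.8), the Python sketch below fits a single-Y PLS model and computes both quantities. It is not the WB-PLS engine. The NIPALS PLS1 routine, every function name, the synthetic data, the choice of two components, and the use of mean-centering without scaling are assumptions made only for this example.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS on mean-centered data (assumed setup)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, T, q = [], [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # unit-norm weight vector
        t = Xr @ w                      # X scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        qa = yr @ t / tt                # Y loading (scalar)
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - qa * t                # deflate y
        W.append(w); P.append(p); T.append(t); q.append(qa)
    W, P, T, q = np.array(W).T, np.array(P).T, np.array(T).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return {"W": W, "T": T, "q": q, "coef": coef,
            "x_mean": x_mean, "y_mean": y_mean}

def pls1_predict(model, X):
    return (X - model["x_mean"]) @ model["coef"] + model["y_mean"]

def q2_kfold(X, y, n_components, n_groups=7, seed=0):
    """q^2 = 1 - PRESS / SS_total, leaving out one of n_groups groups at a time."""
    idx = np.random.default_rng(seed).permutation(len(y))
    press = 0.0
    for test in np.array_split(idx, n_groups):
        train = np.setdiff1d(idx, test)
        model = pls1_fit(X[train], y[train], n_components)
        press += np.sum((y[test] - pls1_predict(model, X[test])) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def vip_scores(model):
    """VIP per descriptor: weights squared, weighted by Y variance explained per component."""
    W, T, q = model["W"], model["T"], model["q"]
    ss = q ** 2 * np.sum(T ** 2, axis=0)     # explained Y sum of squares per component
    return np.sqrt(W.shape[0] * (W ** 2 @ ss) / ss.sum())

# Synthetic stand-in for one QSAR series: 70 compounds, 10 descriptors (hypothetical).
rng = np.random.default_rng(42)
X = rng.normal(size=(70, 10))
y = 2.0 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=70)

q2 = q2_kfold(X, y, n_components=2)
vip = vip_scores(pls1_fit(X, y, n_components=2))
significant = q2 >= 0.3                  # model-level threshold from the abstract
relevant = np.flatnonzero(vip > 0.8)     # descriptor-level threshold from the abstract
print(f"q2 (7 groups) = {q2:.3f}, significant = {significant}")
print("descriptor indices with VIP > 0.8:", relevant)
```

Because the weight vectors are unit-normalized, the squared VIP values sum to the number of descriptors, so a VIP of 1 marks a descriptor of average importance and the 0.8 cutoff sits slightly below that average.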
- Subjects :
- Quantitative structure–activity relationship
Databases, Factual
Series (mathematics)
Information Storage and Retrieval
Individual level
Computer Science Applications
Automation
Redundancy (information theory)
Fingerprint
Drug Discovery
Data mining
Physical and Theoretical Chemistry
computer
Mathematics
Block (data storage)
Details
- ISSN :
- 1573-4951 and 0920-654X
- Volume :
- 18
- Database :
- OpenAIRE
- Journal :
- Journal of Computer-Aided Molecular Design
- Accession number :
- edsair.doi.dedup.....3dfba4877b0235722b19dfb754678a35
- Full Text :
- https://doi.org/10.1007/s10822-004-4060-8