An automated PLS search for biologically relevant QSAR descriptors

Authors :
Tudor I. Oprea
Marius Olah
Cristian Bologa
Source :
Journal of Computer-Aided Molecular Design. 18:437-449
Publication Year :
2004
Publisher :
Springer Science and Business Media LLC, 2004.

Abstract

An automated PLS engine, WB-PLS, was applied to 1632 QSAR series with at least 25 compounds per series extracted from WOMBAT (WOrld of Molecular BioAcTivity). WB-PLS extracts a single Y variable per series, as well as pre-computed X variables from a table. The table contained 2D descriptors, the drug-like MDL 320 keys as implemented in the Mesa A&C Fingerprint module, and in-house generated topological-pharmacophore SMARTS counts and fingerprints. Each descriptor type was treated as a block, with or without scaling. Cross-validation, variable importance on projections (VIP) above 0.8 and q² ≥ 0.3 were applied as criteria for model significance. Among cross-validation methods, leave-one-in-seven-out (CV7) is a better measure of model significance than leave-one-out (which measures redundancy) or leave-half-out (which is too restrictive). SMARTS counts overlap with 2D descriptors (both are more quantitative in nature), whereas MDL keys overlap with in-house fingerprints (both are more qualitative). SMARTS counts are the most effective of the four descriptor systems. At the level of individual descriptors, size-related descriptors and topological indices (in the 2D property space), as well as branched SMARTS, aromatic and ring atom types, and halogens, are found to be most relevant according to the VIP criterion.
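
The significance criteria named in the abstract (a cross-validated q² from seven cross-validation groups, and VIP scores per descriptor) can be illustrated with a minimal sketch. This is not the WB-PLS implementation: the function names, the single-response NIPALS algorithm, the interleaved assignment of compounds to the seven groups, the use of the overall mean of Y in the q² denominator, and the absence of descriptor scaling are all assumptions made for illustration.

```python
import math

def fit_pls1(X, y, n_components=1):
    """Fit a single-response PLS model by NIPALS on mean-centred data (illustrative)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    comps = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]                              # unit-length weights
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        c = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - c * t[i] for i in range(n)]
        comps.append((w, load, c, c * c * tt))  # last item: SS of y explained
    return x_mean, y_mean, comps

def predict_pls1(model, x):
    """Predict y for one descriptor vector by replaying the deflation steps."""
    x_mean, y_mean, comps = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, c, _ in comps:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += c * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat

def q2_grouped(X, y, n_components=1, n_groups=7):
    """q2 = 1 - PRESS/SS, leaving out one of n_groups interleaved groups at a time."""
    n = len(y)
    press = 0.0
    for g in range(n_groups):
        train = [i for i in range(n) if i % n_groups != g]
        test = [i for i in range(n) if i % n_groups == g]
        model = fit_pls1([X[i] for i in train], [y[i] for i in train], n_components)
        press += sum((y[i] - predict_pls1(model, X[i])) ** 2 for i in test)
    y_mean = sum(y) / n
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss

def vip_scores(model):
    """VIP per descriptor: weights squared, averaged by explained SS of y."""
    comps = model[2]
    p = len(comps[0][0])
    total = sum(ssy for _, _, _, ssy in comps)
    return [math.sqrt(p * sum(ssy * w[j] ** 2 for w, _, _, ssy in comps) / total)
            for j in range(p)]
```

Under these assumptions, a series would pass the filter described in the abstract when `q2_grouped(X, y)` is at least 0.3, and a descriptor would be flagged as relevant when its entry in `vip_scores(fit_pls1(X, y))` exceeds 0.8.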

Details

ISSN :
1573-4951 and 0920-654X
Volume :
18
Database :
OpenAIRE
Journal :
Journal of Computer-Aided Molecular Design
Accession number :
edsair.doi.dedup.....3dfba4877b0235722b19dfb754678a35
Full Text :
https://doi.org/10.1007/s10822-004-4060-8