Identification of Groupings of Graph Theoretical Molecular Descriptors Using a Hybrid Cluster Analysis Approach
- Source :
- Journal of Chemical Information and Computer Sciences. 40:1128-1146
- Publication Year :
- 2000
- Publisher :
- American Chemical Society (ACS), 2000.
Abstract
- An abundance of structural molecular descriptors of various forms has been proposed and tested over the years. Very often different descriptors represent, more or less, the same aspects of molecular structure and thus have diminished discriminating power for identifying the different structural features that might contribute to the molecular property or activity of interest. It is therefore essential to employ noncorrelated descriptors to ensure the widest and least inflated possible coverage of chemical space. The most common approach for reducing the number of descriptors and employing noncorrelated (or orthogonal) descriptors involves principal component analysis (PCA) or other factor analytical techniques. In this work we present an approach for determining relationships (groupings) among 240 graph-theoretical descriptors, as a means of selecting nonredundant ones, based on the application of cluster analysis (CA). To remove the inherent biases and particularities of different CA algorithms, several clustering solutions obtained with these algorithms were "hybridized" to give a reliable overall solution describing how the interrelationships within the data are structured. The calculated correlation coefficients between descriptors were used as a reference for discussing the different CA methods employed, and the resulting clusters of descriptors were statistically analyzed to derive the intercorrelations between the different operators, weighting schemes, and matrices used to compute these descriptors.
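The abstract does not say which CA algorithms, dissimilarity measure, or hybridization rule the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: descriptors are compared by `1 - |r|` (Pearson correlation), three standard linkages (single, complete, average) stand in for "different CA algorithms", and the solutions are combined by a simple majority vote on which descriptor pairs end up together. The function names and the toy descriptor values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def agglomerate(dist, k, linkage):
    """Naive agglomerative clustering: merge the closest pair until k clusters remain.

    `linkage` reduces the list of pairwise distances between two clusters
    to a single number (min = single, max = complete, mean = average).
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = linkage([dist[i][j] for i in clusters[a] for j in clusters[b]])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # a < b, so index a is still valid
        clusters[a].extend(merged)
    return clusters


def consensus_groups(descriptors, k):
    """Combine several clustering solutions by majority vote (an assumed rule)."""
    names = list(descriptors)
    n = len(names)
    # Assumed dissimilarity: strongly (anti)correlated descriptors are "close".
    dist = [[1 - abs(pearson(descriptors[p], descriptors[q])) for q in names]
            for p in names]
    linkages = [min, max, lambda ds: sum(ds) / len(ds)]
    votes = [[0] * n for _ in range(n)]
    for linkage in linkages:
        for cluster in agglomerate(dist, k, linkage):
            for i in cluster:
                for j in cluster:
                    votes[i][j] += 1
    # Keep pairs co-clustered by a strict majority; groups are connected components.
    groups, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and 2 * votes[i][j] > len(linkages):
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(names[i] for i in component))
    return groups


# Hypothetical descriptor values for six molecules (not data from the paper).
toy = {
    "d1": [1, 2, 3, 4, 5, 6],
    "d2": [2, 4, 6, 8, 10, 12.5],    # nearly proportional to d1
    "d3": [5, 1, 4, 2, 6, 3],
    "d4": [10, 2.2, 8, 4, 12.1, 6],  # nearly proportional to d3
}
groups = consensus_groups(toy, k=2)
print(groups)
```

Picking one descriptor from each resulting group would then give a nonredundant subset in the spirit of the abstract; with 240 descriptors a library implementation of hierarchical clustering would be more practical than this quadratic toy loop.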
- Subjects :
- Models, Molecular
Quantitative structure–activity relationship
Computer science
Computation
Pattern recognition
General Chemistry
Chemical space
Computer Science Applications
Weighting
Computational Theory and Mathematics
Molecular descriptor
Principal component analysis
Cluster Analysis
Combinatorial Chemistry Techniques
Graph (abstract data type)
Artificial intelligence
Information Systems
Details
- ISSN :
- 0095-2338
- Volume :
- 40
- Database :
- OpenAIRE
- Journal :
- Journal of Chemical Information and Computer Sciences
- Accession number :
- edsair.doi.dedup.....1529fd5ada0981b039abb21120090ae0
- Full Text :
- https://doi.org/10.1021/ci990149y