Comparing the CORAL and random forest approaches for modelling the in vitro cytotoxicity of silica nanomaterials
- Source :
- Scopus-Elsevier
Abstract
- Nanotechnology is one of the most important technological developments of the 21st century. In silico methods to predict toxicity, such as quantitative structure–activity relationships (QSARs), promote the safe-by-design approach for the development of new materials, including nanomaterials. In this study, a set of experimental cytotoxicity data comprising 19 data points for silica nanomaterials was investigated, to compare the widely employed CORAL and Random Forest approaches in terms of their usefulness for developing so-called ‘nano-QSAR’ models. ‘External’ leave-one-out cross-validation (LOO) analysis was performed to validate the two approaches. An analysis of variable importance measures and signed feature contributions for both algorithms was undertaken, in order to interpret the models developed. CORAL showed a more pronounced difference between the average coefficient of determination (R²) for training and for LOO (0.83 and 0.65, respectively) than Random Forest did (0.87 and 0.78 without bootstrap sampling, 0.90 and 0.78 with bootstrap sampling); this larger gap may be due to overfitting. With regard to the physicochemical properties of the nanomaterials, the aspect ratio and zeta potential were found to be the two most important variables for Random Forest, and the average feature contributions calculated for the corresponding descriptors were consistent with the clear trends observed in the data set: less negative zeta potential values and lower aspect ratio values were associated with higher cytotoxicity. In contrast, CORAL failed to capture these trends.
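The comparison of training R² with leave-one-out R² described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: a one-descriptor least-squares line stands in for both CORAL and Random Forest, the held-out predictions are pooled into a single R² (the abstract reports averages), and the `zeta` and `tox` values are invented purely for illustration. All function names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def loo_predictions(xs, ys):
    """Predict each point from a model fitted on all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * xs[i])
    return preds


# Invented illustrative data (NOT from the study): zeta potential in mV
# and a toxicity score that rises as zeta potential becomes less negative.
zeta = [-45.0, -40.0, -38.0, -30.0, -25.0, -20.0, -15.0, -10.0]
tox = [0.10, 0.22, 0.15, 0.35, 0.41, 0.38, 0.60, 0.66]

a, b = fit_line(zeta, tox)
r2_train = r_squared(tox, [a + b * x for x in zeta])
r2_loo = r_squared(tox, loo_predictions(zeta, tox))

print(f"training R2 = {r2_train:.2f}, LOO R2 = {r2_loo:.2f}")
```

A large drop from the training value to the leave-one-out value is the kind of gap the abstract reads as a possible sign of overfitting.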
- Subjects :
- Coefficient of determination
Loo
Quantitative Structure-Activity Relationship
Nanotechnology
02 engineering and technology
010501 environmental sciences
Biology
Overfitting
Toxicology
01 natural sciences
General Biochemistry, Genetics and Molecular Biology
QH301
Statistics
Toxicity Tests
Feature (machine learning)
QD
0105 earth and related environmental sciences
Contrast (statistics)
General Medicine
Models, Theoretical
021001 nanoscience & nanotechnology
Silicon Dioxide
Random forest
Nanostructures
Data set
Medical Laboratory Technology
Data point
0210 nano-technology
Details
- ISSN :
- 0261-1929
- Database :
- OpenAIRE
- Journal :
- Scopus-Elsevier
- Accession number :
- edsair.doi.dedup.....c657add482740c1eb9e760a0901baa9c