Prediction of the datasets modelability for the building of QSAR classification models by means of the centroid based rivality index.

Authors :
Luque Ruiz, Irene
Gómez-Nieto, Miguel Ángel
Source :
Journal of Mathematical Chemistry. May 2019, Vol. 57 Issue 5, p1374-1393. 20p.
Publication Year :
2019

Abstract

The modelability index of a dataset of molecules measures the capacity of the dataset to be modeled using a QSAR algorithm. This measure makes it possible to predict the correct classification rate of the dataset by counting, for each molecule, the nearest neighbors that belong to its own class. In this paper, we propose a new measure for predicting the modelability of datasets, based on the nearest-neighbor-based rivality index and the centroid-based rivality index. These indexes take into account the noise that a nearest neighbor belonging to a different class could introduce into the results of the QSAR classification algorithm. Using thirty benchmark datasets, two types of dataset representation, and six different algorithms, we show the excellent behavior of the proposed indexes, obtaining correlations with R² values greater than 0.9 between the correct classification rate obtained by five-fold cross-validation and the modelability index calculated using the centroid-based rivality index. [ABSTRACT FROM AUTHOR]
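
The baseline idea the abstract describes, counting how many molecules have a nearest neighbor of their own class, can be illustrated with a minimal Python sketch. This is not the authors' rivality index, whose definition is not given in this record. The function name `nearest_neighbor_modelability`, the use of Euclidean distance on plain descriptor tuples, and the averaging of per-class ratios are all assumptions made for illustration.

```python
import math


def nearest_neighbor_modelability(points, labels):
    """Illustrative (hypothetical) modelability score in [0, 1].

    For each class, compute the fraction of its samples whose single
    nearest neighbor (Euclidean distance, an assumption) has the same
    class label, then average those fractions over the classes.
    """
    hits = {}    # per class: samples whose nearest neighbor shares the class
    totals = {}  # per class: number of samples
    for i, (point, label) in enumerate(zip(points, labels)):
        # Index of the closest other sample.
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )
        totals[label] = totals.get(label, 0) + 1
        if labels[nearest] == label:
            hits[label] = hits.get(label, 0) + 1
    ratios = [hits.get(c, 0) / n for c, n in totals.items()]
    return sum(ratios) / len(ratios)


# Two well-separated classes: every nearest neighbor shares the class.
separated = nearest_neighbor_modelability(
    [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)], [0, 0, 1, 1]
)

# Overlapping classes: two samples sit closest to the other class.
overlapping = nearest_neighbor_modelability(
    [(0.0, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 0.0)], [0, 0, 1, 1]
)
```

Under these assumptions, a score near 1 suggests a dataset whose classes are locally well separated, while a lower score indicates that many molecules sit closest to a molecule of a different class, which is the kind of noise the abstract says the rivality indexes are designed to account for.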

Details

Language :
English
ISSN :
02599791
Volume :
57
Issue :
5
Database :
Academic Search Index
Journal :
Journal of Mathematical Chemistry
Publication Type :
Academic Journal
Accession number :
136417013
Full Text :
https://doi.org/10.1007/s10910-018-0972-8