
A Variable Ranking Method for Machine Learning Models with Correlated Features: In-Silico Validation and Application for Diabetes Prediction.

Authors :
Vettoretti, Martina
Di Camillo, Barbara
Source :
Applied Sciences (2076-3417); Aug 2021, Vol. 11, Issue 16, p7740, 19p
Publication Year :
2021

Abstract

Featured Application: The methodology proposed in this paper enables robust variable ranking in statistical learning or machine learning models with highly correlated features.

When building a model to predict a clinical outcome using machine learning techniques, developers are often interested in ranking the features according to their predictive ability. A commonly used approach to obtain a robust variable ranking is to apply recursive feature elimination (RFE) on multiple resamplings of the training set and then aggregate the ranking results using the Borda count method. However, highly correlated features in the training set can degrade the ranking performance. In this work, we propose a variant of the RFE and Borda count method that takes the correlation between variables into account during the ranking procedure, in order to improve ranking performance in the presence of highly correlated features. The proposed algorithm is tested on simulated datasets in which the true variable importance is known and is compared with the standard RFE-Borda count method. In terms of the root mean square error between the estimated rank and the true (i.e., simulated) feature importance, the proposed algorithm outperforms the standard RFE-Borda count method. Finally, the proposed algorithm is applied to a case study on the development of a predictive model of type 2 diabetes onset. [ABSTRACT FROM AUTHOR]
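The aggregation step mentioned in the abstract can be illustrated with a minimal sketch of a standard Borda count. This is not the correlation-aware variant the authors propose, whose details are not given in this record. The function name `borda_aggregate`, the feature names, and the three example rankings are hypothetical; in practice each ranking would come from one RFE run on a resampling of the training set.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Aggregate several rankings (best feature first) with a Borda count.

    Assumption: every ranking lists features from most to least important.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        p = len(ranking)
        for position, feature in enumerate(ranking):
            # The top-ranked feature earns p - 1 points, the last earns 0.
            scores[feature] += p - 1 - position
    # Higher total score means a better consensus rank; ties broken by name.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings, standing in for RFE output on three resamplings.
rfe_rankings = [
    ["glucose", "bmi", "age", "hdl"],
    ["glucose", "age", "bmi", "hdl"],
    ["bmi", "glucose", "age", "hdl"],
]

consensus = borda_aggregate(rfe_rankings)
print(consensus)
```

Summing positional scores across resamplings smooths out run-to-run variation in any single RFE ranking, which is why this kind of aggregation is described as yielding a more robust ranking.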

Details

Language :
English
ISSN :
20763417
Volume :
11
Issue :
16
Database :
Complementary Index
Journal :
Applied Sciences (2076-3417)
Publication Type :
Academic Journal
Accession number :
152111724
Full Text :
https://doi.org/10.3390/app11167740