Back to Search Start Over

Autoencoder-kNN meta-model based data characterization approach for an automated selection of AI algorithms

Authors :
Moncef Garouani
Adeel Ahmad
Mourad Bouneffa
Mohamed Hamlich
Source :
Journal of Big Data, Vol 10, Iss 1, Pp 1-18 (2023)
Publication Year :
2023
Publisher :
SpringerOpen, 2023.

Abstract

Abstract The recent evolution of machine learning (ML) algorithms and the high level of expertise required to use them have fuelled the demand for non-experts solutions. The selection of an appropriate algorithm and the configuration of its hyperparameters is among the most complicated tasks while applying ML to new problems. It necessitates well awareness and knowledge of ML algorithms. The algorithm selection problem (ASP) is defined as the process of identifying the algorithm (s) that can deliver top performance for a particular problem, task, and evaluation measure. In this context, meta-learning is one of the approaches to achieve this objective by using prior learning experiences to assist the learning process on unseen problems and tasks. As a data-driven approach, appropriate data characterization is of vital importance for the meta-learning. Nonetheless, the recent literature witness a variety of data characterization techniques including simple, statistical and information theory based measures. However, their quality still needs to be improved. In this paper, a new Autoencoder-kNN (AeKNN) based meta-model with built-in latent features extraction is proposed. The approach is aimed to extract new characterizations of the data, with lower dimensionality but more significant and meaningful features. AeKNN internally uses a deep autoencoder as a latent features extractor from a set of existing meta-features induced from the dataset. From this new features vectors the computed distances are more significant, thus providing a way to accurately recommending top-performing pipelines for previously unseen datasets. In an application on a large-scale hyperparameters optimization task for 400 real world datasets with varying schemas as a meta-learning task, we show that AeKNN offers considerable improvements of the classical kNN as well as traditional meta-models in terms of performance.

Details

Language :
English
ISSN :
21961115
Volume :
10
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Journal of Big Data
Publication Type :
Academic Journal
Accession number :
edsdoj.74763689a38a4a7587adef4a0da8d524
Document Type :
article
Full Text :
https://doi.org/10.1186/s40537-023-00687-7