Finding relevant information in big datasets with ML
- Publication Year :
- 2024
Abstract
- Due to the abundance of data, noisy, irrelevant, or redundant features often need to be identified and discarded. Feature selection is a collection of methods used to ensure that only relevant data are used for a data analysis task. Extracting and using only useful data for analysis promotes model understanding and performance and reduces the model training time and variance, i.e., overfitting. There is an abundance of methods for feature selection, and they can be categorised by various perspectives and are applicable to differing use cases. In this tutorial, we introduce the feature selection problem and present it from three perspectives of categorisation: search strategy, model reliance, and relevance definition. Furthermore, we propose a guideline for the use of the various methods. Lastly, we discuss current challenges and opportunities for research on feature selection.
- The project leading to this publication has received funding from 2020 research and innovation programme (grant agreement No 955895).
- Peer Reviewed
- Postprint (published version)
Details
- Database :
- OAIster
- Notes :
- 4 p., application/pdf, English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1452496822
- Document Type :
- Electronic Resource