Application of Symbolic Classifiers and Multi-Ensemble Threshold Techniques for Android Malware Detection.

Authors :
Anđelić, Nikola
Baressi Šegota, Sandi
Mrzljak, Vedran
Source :
Big Data & Cognitive Computing; Feb 2025, Vol. 9, Issue 2, p27, 49p
Publication Year :
2025

Abstract

Android malware detection using artificial intelligence is now an essential tool for preventing cyber attacks. To address this problem, the methodology proposed in this paper applies a genetic programming symbolic classifier (GPSC) to obtain symbolic expressions (SEs) that can detect whether an Android application is malware. To find the optimal combination of GPSC hyperparameter values, a random hyperparameter values search (RHVS) method was used, and the GPSC was trained using 5-fold cross-validation (5FCV). The initial, publicly available dataset is highly imbalanced. This problem was addressed by applying various preprocessing and oversampling techniques, creating a large number of balanced dataset variations; the GPSC was trained on each variation. Since the dataset has many input variables, three approaches were considered: an initial investigation with all input variables, input variables with high feature importance, and application of principal component analysis. After the SEs with the highest classification performance were obtained, they were used in threshold-based voting ensembles (TBVEs), and the threshold values were adjusted to improve classification performance. Multiple TBVEs (multi-TBVE) were developed, and with them a robust system for Android malware detection was achieved, with a highest accuracy of 0.98. [ABSTRACT FROM AUTHOR]
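The threshold-based voting idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual symbolic expressions, features, or the exact threshold-tuning procedure, so everything below is an assumption made for illustration: the three expressions (`se_1`, `se_2`, `se_3`), the sigmoid mapping from expression output to class score, the toy samples, and the helper names are all hypothetical and not taken from the paper.

```python
import math

def sigmoid(z):
    """Map a raw symbolic-expression output to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical symbolic expressions (assumed, NOT from the paper).
def se_1(x):
    return x[0] - 0.5 * x[1]

def se_2(x):
    return x[0] * x[2] - 1.0

def se_3(x):
    return math.sqrt(abs(x[1])) - x[2]

EXPRESSIONS = [se_1, se_2, se_3]

def tbve_predict(x, threshold):
    """Label x as malware (1) when at least `threshold` expressions vote 1."""
    votes = sum(1 for se in EXPRESSIONS if sigmoid(se(x)) > 0.5)
    return 1 if votes >= threshold else 0

def accuracy(samples, labels, threshold):
    """Fraction of samples the ensemble labels correctly at this threshold."""
    correct = sum(tbve_predict(x, threshold) == y
                  for x, y in zip(samples, labels))
    return correct / len(labels)

def best_threshold(samples, labels):
    """Try every possible vote threshold and keep the most accurate one."""
    candidates = range(1, len(EXPRESSIONS) + 1)
    return max(candidates, key=lambda t: accuracy(samples, labels, t))

# Toy data, made up for illustration only (1 = malware, 0 = benign).
SAMPLES = [(2.0, 1.0, 2.0), (0.0, 2.0, 0.5), (3.0, 0.0, 1.0), (0.1, 1.0, 3.0)]
LABELS = [1, 0, 1, 0]

if __name__ == "__main__":
    t = best_threshold(SAMPLES, LABELS)
    print("threshold:", t, "accuracy:", accuracy(SAMPLES, LABELS, t))
```

In this sketch the threshold is the minimum number of expressions that must agree before a sample is flagged. Lowering it tends to favor recall, raising it tends to favor precision, which is one plausible reason to tune it rather than fix it at a simple majority.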

Details

Language :
English
ISSN :
2504-2289
Volume :
9
Issue :
2
Database :
Complementary Index
Journal :
Big Data & Cognitive Computing
Publication Type :
Academic Journal
Accession number :
183329785
Full Text :
https://doi.org/10.3390/bdcc9020027