Back to Search Start Over

Feature mining and classifier selection for API calls-based malware detection.

Authors :
Balan, Gheorghe
Simion, Ciprian-Alin
Gavriluţ, Dragoş Teodor
Luchian, Henri
Source :
Applied Intelligence; Dec2023, Vol. 53 Issue 23, p29094-29108, 15p
Publication Year :
2023

Abstract

This paper deals with a major challenge in cyber-security: the need to respond to ever renewed techniques used by attackers in order to avoid detection based on analysing static features of malware. These constantly renewed techniques consist of various changes in file geometry, entropy a.s.o. As a consequence, static malware features sets describe less and less accurately the malicious files; hence, the performance of machine learning models in detecting new variants of the same malware family may be severely impaired. The paper focuses on a promising approach to this detection challenge: defining file features based on OS (operating system) API (Application Program Interface) calls sequences. We explore in detail the detection potential of such features, since, in order to act maliciously, these features are highly unlikely to be hidden. We studied several tens of thousands of such features, a modest-sized subset of which were subsequently fed to several machine learning models. The database used for training and testing consists of 1.5 million files, including malicious files from the polymorphic families Emotet and Trickbot. Using this database, nearly 4,000 pairings (classifier, feature selection algorithm) were trained / tested. Our experimental results show that the API (Application Program Interface) calls-oriented feature mining process is well suited for detecting polymorphic malware. A comparative discussion of the detection results of the various models is presented; depending on the target optimisation criterion (detection rate / false positive rate / saving resources), three of the 4,000 classification models turn out to be best suited for real-world applications: Random Forrest, Legacy Neural Networks and Decision Tree. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0924669X
Volume :
53
Issue :
23
Database :
Complementary Index
Journal :
Applied Intelligence
Publication Type :
Academic Journal
Accession number :
173923740
Full Text :
https://doi.org/10.1007/s10489-023-05086-2