AIHGAT: A novel method of malware detection and homology analysis using assembly instruction heterogeneous graph.

Authors :
Wang, Runzheng
Gao, Jian
Huang, Shuhua
Source :
International Journal of Information Security. Oct2023, Vol. 22 Issue 5, p1423-1443. 21p.
Publication Year :
2023

Abstract

Malicious code is increasingly organized into families, and research on malware homology (the classification of malicious code by family) is of great significance for maintaining network security. To better express the overall characteristics of malicious code and improve detection and homology analysis, this paper proposes AIHGAT, a method for malware detection and homology analysis based on heterogeneous graphs of assembly instructions. We take the assembly instructions of malicious families as the research object and analyze the importance and correlation of the assembly instructions of different families. Detection and homology analysis are carried out in three stages: feature extraction, feature preprocessing, and model construction. In feature extraction, static features are difficult to extract from malicious samples that use countermeasures such as packing and obfuscation; to alleviate this, we obtain binary files from dynamic memory through a sandbox and then analyze their assembly instruction sets. In feature preprocessing, we divide the assembly instructions into N-tuples (N-grams) and construct a heterogeneous graph of assembly instructions according to the internal correlation of the gene sequence composed of the assembly N-gram features. Finally, in model construction, we analyze how well traditional graph neural networks determine homology and construct a Graph Attention Network with residual connections, named ResGAT, to analyze the homology of malicious code. The experimental results show that ResGAT can gather the core characteristics of malicious families and enhance the ability to recognize variants of malicious families. Our model achieves an accuracy of 98.83%, which is better than traditional machine learning detection methods, and can effectively determine the homology of malicious code families. [ABSTRACT FROM AUTHOR]
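
The feature preprocessing step described in the abstract (splitting assembly instructions into N-grams and linking them in a heterogeneous graph) can be illustrated with a minimal sketch. The abstract does not specify the node types, edge weights, or correlation measure, so everything below is an assumption made for illustration: the function names `opcode_ngrams` and `build_heterogeneous_graph`, the two node types ("sample" and "ngram"), the use of raw counts as edge weights, and the `window` parameter are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=3):
    """Slide a window of size n over an opcode sequence; return the n-grams as tuples."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def build_heterogeneous_graph(samples, n=3, window=2):
    """Build a toy two-type graph from {sample_id: [opcode, ...]}.

    Assumed (hypothetical) design, not the paper's:
      - node types: "sample" and "ngram"
      - sample-ngram edge weight: how often the n-gram occurs in that sample
      - ngram-ngram edge weight: how often two distinct n-grams start within
        `window` positions of each other in the n-gram sequence
    Returns (nodes, edges): nodes maps node -> type, edges maps (u, v) -> weight.
    """
    nodes = {}
    edges = Counter()
    for sample_id, opcodes in samples.items():
        sample_node = ("sample", sample_id)
        nodes[sample_node] = "sample"
        grams = opcode_ngrams(opcodes, n)

        # Sample-to-n-gram edges, weighted by occurrence count.
        for gram, count in Counter(grams).items():
            gram_node = ("ngram", gram)
            nodes[gram_node] = "ngram"
            edges[(sample_node, gram_node)] = count

        # N-gram-to-n-gram edges from local co-occurrence.
        for i in range(len(grams)):
            for j in range(i + 1, min(i + window, len(grams))):
                if grams[i] != grams[j]:
                    # Sort so each undirected pair is stored under one key.
                    a, b = sorted([grams[i], grams[j]])
                    edges[(("ngram", a), ("ngram", b))] += 1
    return nodes, dict(edges)


if __name__ == "__main__":
    demo = {"s1": ["push", "mov", "call", "mov", "call"]}
    demo_nodes, demo_edges = build_heterogeneous_graph(demo, n=2, window=2)
    for edge, weight in demo_edges.items():
        print(edge, weight)
```

A graph of this shape could then be passed to a graph attention model; the counts used here are only a stand-in for whatever importance and correlation measures the authors actually use.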

Details

Language :
English
ISSN :
16155262
Volume :
22
Issue :
5
Database :
Academic Search Index
Journal :
International Journal of Information Security
Publication Type :
Academic Journal
Accession number :
172329091
Full Text :
https://doi.org/10.1007/s10207-023-00699-7