Back to Search Start Over

Semantic aware-based instruction embedding for binary code similarity detection.

Authors :
Jia, Yuhao
Yu, Zhicheng
Hong, Zhen
Source :
PLoS ONE. 6/11/2024, Vol. 19 Issue 6, p1-20. 20p.
Publication Year :
2024

Abstract

Binary code similarity detection plays a crucial role in various applications within binary security, including vulnerability detection, malicious software analysis, etc. However, existing methods suffer from limited differentiation in binary embedding representations across different compilation environments, lacking dynamic high-level semantics. Moreover, current approaches often neglect multi-level semantic feature extraction, thereby failing to acquire precise semantic information about the binary code. To address these limitations, this paper introduces a novel detection solution called BinBcla. This method employs an enhanced pre-training model to generate instruction embeddings with dynamic semantics for binary functions. Subsequently, multi-feature fusion technique is utilized to extract local semantic information and long-distance global features from the code, respectively, employing self-attention to comprehend the structure information of the code. Finally, an improved cosine similarity method is employed to learn relationships among all elements of the distance vectors, thereby enhancing the model's robustness to new sample functions. Experiments are conducted across different architectures, compilers, and optimization levels. The results indicate that BinBcla achieves higher accuracy, precision and F1 score compared to existing methods. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
19326203
Volume :
19
Issue :
6
Database :
Academic Search Index
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
177801988
Full Text :
https://doi.org/10.1371/journal.pone.0305299