BinGold: Towards robust binary analysis by extracting the semantics of binary code as semantic flow graphs (SFGs)
- Source :
- Digital Investigation. :S11-S22
- Publisher :
- The Author(s). Published by Elsevier Ltd.
Abstract
- Binary analysis is useful in many practical applications, such as the detection of malware or vulnerable software components. However, our survey of the literature shows that most existing binary analysis tools and frameworks rely on assumptions about specific compilers and compilation settings. It is well known that techniques such as refactoring and light obfuscation can significantly alter the structure of code, even for simple programs. Applying such techniques or changing the compiler and compilation settings can significantly affect the accuracy of available binary analysis tools, which severely limits their practicability, especially when applied to malware. To address these issues, we propose a novel technique that extracts the semantics of binary code in terms of both data and control flow. Our technique allows more robust binary analysis because the extracted semantics of the binary code is generally immune from light obfuscation, refactoring, and varying the compilers or compilation settings. Specifically, we apply data-flow analysis to extract the semantic flow of the registers as well as the semantic components of the control flow graph, which are then synthesized into a novel representation called the semantic flow graph (SFG). Subsequently, various properties, such as reflexive, symmetric, antisymmetric, and transitive relations, are extracted from the SFG and applied to binary analysis. We implement our system in a tool called BinGold and evaluate it against thirty binary code applications. Our evaluation shows that BinGold successfully determines the similarity between binaries, yielding results that are highly robust against light obfuscation and refactoring. In addition, we demonstrate the application of BinGold to two important binary analysis tasks: binary code authorship attribution, and the detection of clone components across program executables. The promising results suggest that BinGold can be used to enhance existing techniques, making them more robust and practical.
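The abstract mentions extracting relation properties (reflexive, symmetric, antisymmetric, transitive) from the SFG. As a rough illustration only, the sketch below treats the edges of a small directed graph as a binary relation and tests those four properties. The function name `relation_properties`, the toy register graph, and the idea of checking the properties over a plain edge set are assumptions made for this example; the record does not describe how BinGold builds an SFG or computes these properties.

```python
from itertools import product


def relation_properties(nodes, edges):
    """Classify a directed edge set as a binary relation over `nodes`.

    Hypothetical helper for illustration; not BinGold's actual procedure.
    """
    rel = set(edges)
    return {
        # every node is related to itself
        "reflexive": all((n, n) in rel for n in nodes),
        # every edge has its reverse
        "symmetric": all((b, a) in rel for a, b in rel),
        # no pair of distinct nodes is connected in both directions
        "antisymmetric": all(a == b or (b, a) not in rel for a, b in rel),
        # a -> b and b -> d together imply a -> d
        "transitive": all(
            (a, d) in rel
            for (a, b), (c, d) in product(rel, rel)
            if b == c
        ),
    }


# Toy graph (assumed data): a value flows eax -> ebx -> ecx, and eax -> ecx.
toy_nodes = {"eax", "ebx", "ecx"}
toy_edges = {("eax", "ebx"), ("ebx", "ecx"), ("eax", "ecx")}
props = relation_properties(toy_nodes, toy_edges)
```

For this toy graph the relation is antisymmetric and transitive but neither reflexive nor symmetric. Which relations an SFG actually exposes, and how they feed into similarity scoring, is left to the full paper.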
- Subjects :
- Theoretical computer science
Computer science
02 engineering and technology
Assembly instructions
computer.software_genre
Binary Analysis
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Binary relation
Reverse engineering
Data-flow analysis
020207 software engineering
computer.file_format
Computer Science Applications
Obfuscation (software)
Medical Laboratory Technology
Semantic features
Code refactoring
Semantic flow graph
Control flow graph
Binary code
Executable
Compiler
computer
Law
Data flow analysis
Details
- Language :
- English
- ISSN :
- 17422876
- Database :
- OpenAIRE
- Journal :
- Digital Investigation
- Accession number :
- edsair.doi.dedup.....496ad542d39264acdce72e7303cf6488
- Full Text :
- https://doi.org/10.1016/j.diin.2016.04.002