Software analysis through binary function identification

Authors :
Patrick-Evans, James
Publication Year :
2022
Publisher :
Royal Holloway, University of London, 2022.

Abstract

Executable binaries are made up of functional components interacting with each other and the operating system they run on. When high-level source code is compiled into executable binaries, information on the name, size, location, and type of these functional components is included in the executable through the use of symbols. Most software distributed today that is compiled into machine code is released without this symbol information, i.e., it is stripped. This makes understanding and analysing binary software very difficult, because recognisable information is no longer available in a structured and ordered form. In this thesis, we propose new techniques to recover the names of functions in stripped binaries. We explore problems inherent in recovering textual information in the large label space associated with naming functions and develop deep-learning embeddings for both binary functions and their names. Furthermore, we demonstrate how symbol name information can be used to aid the exposure of previously undiscovered software bugs by injecting faults in the high-level logic of client USB kernel drivers. We design a scalable approach for symbol recovery that uses static and symbolic program analysis to extract high-level features from machine code. These features are then used to learn the structure of how binary code and data interact with each other in order to infer name information for functions in executables. We build a toolkit, DESYL (DEbug Symbol Learning), that is able to modify stripped executable binaries and add symbol information using machine learning models learnt over a very large dataset. Finally, we develop USBDT (USB Driver Testing), our tool for hooking known kernel functions and using selective symbolic execution to analyse Linux USB kernel drivers. Our work extends QEMU to build a software-defined virtual USB device used to analyse the Linux USB stack and helped develop two previously unreported mainline Linux kernel zero-day exploits.

Details

Language :
English
Database :
British Library EThOS
Publication Type :
Dissertation/Thesis
Accession number :
edsble.865214
Document Type :
Electronic Thesis or Dissertation