
Android Malware Detection in Bytecode Level Using TF-IDF and XGBoost.

Authors :
Ozogur, Gokhan
Erturk, Mehmet Ali
Aydin, Zeynep Gurkas
Aydin, Muhammed Ali
Source :
Computer Journal; Sep 2023, Vol. 66 Issue 9, p2317-2328, 12p
Publication Year :
2023

Abstract

Android is the dominant operating system in the smartphone market, and there exist millions of applications in various application stores. The increase in the number of applications has necessitated the detection of malicious applications in a short time. Unlike dynamic analysis, static analysis can produce results in a shorter time because there is no need to run the applications. However, obtaining various information from application packages using reverse engineering techniques still requires a substantial amount of processing power. Although some attempts have been made to solve this problem by analyzing binary files without decoding the source code, there is still more work to be done in this area. In this study, we analyzed the applications at the bytecode level without decoding the binary source files. We proposed a model using Term Frequency - Inverse Document Frequency (TF-IDF) word representation for feature extraction and the Extreme Gradient Boosting (XGBoost) method for classification. The experimental results show that our model classifies a given application package as malware or benign in 2.75 s with a 99.05% F1-score on a balanced dataset, and in 3.30 s with a 99.35% F1-score on an imbalanced dataset containing obfuscated malware. [ABSTRACT FROM AUTHOR]
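The feature-extraction idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how bytecode is split into "words", so the tokenization below (overlapping 2-byte tokens rendered as hex) is an assumption. The smoothed IDF formula, the helper names `tokenize` and `tfidf`, and the toy byte strings are likewise illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(data: bytes, n: int = 2) -> list[str]:
    """Split raw bytes into overlapping n-byte tokens, rendered as hex.

    Assumed tokenization; the paper's actual scheme may differ.
    """
    return [data[i:i + n].hex() for i in range(len(data) - n + 1)]


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Compute a sparse TF-IDF vector (token -> weight) for each document."""
    n_docs = len(docs)

    # Document frequency: number of documents containing each token.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        # Term frequency is the token's share of all tokens in the document.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


# Toy byte strings standing in for application bytecode (hypothetical data).
samples = [
    bytes.fromhex("12011a00"),
    bytes.fromhex("12011201"),
    bytes.fromhex("6e201a00"),
]
vectors = tfidf([tokenize(s) for s in samples])
```

In a full pipeline, vectors like these would be converted to a fixed-width matrix and passed to a gradient-boosted tree classifier such as XGBoost. That step is omitted here to keep the sketch free of third-party dependencies.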

Details

Language :
English
ISSN :
0010-4620
Volume :
66
Issue :
9
Database :
Complementary Index
Journal :
Computer Journal
Publication Type :
Academic Journal
Accession number :
172001791
Full Text :
https://doi.org/10.1093/comjnl/bxac198