1. A novel machine learning approach for detecting first-time-appeared malware.
- Author
- Shaukat, Kamran; Luo, Suhuai; Varadharajan, Vijay
- Subjects
- *DEEP learning, *MACHINE learning, *MALWARE, *FEATURE selection, *WILCOXON signed-rank test, *REVERSE engineering
- Abstract
Conventional malware detection approaches incur feature-extraction overhead, depend on domain experts, and are time-consuming and resource-intensive. Learning-based approaches are the mainstay of malware detection because they overcome most of these challenges, significantly improving detection effectiveness while keeping the false positive rate low. The exponential growth of malware variants and of first-time-appeared malware, which includes polymorphic and zero-day attacks, is a significant challenge for learning-based malware detectors and can have a catastrophic impact on their detection effectiveness. This paper proposes a novel deep learning-based framework that detects first-time-appeared malware effectively and efficiently, with better performance than conventional malware detection approaches. First, it translates and visualises each Windows portable executable (PE) file into a coloured image, eliminating the overhead of feature extraction and the need for domain experts to analyse the features. In the second step, a fine-tuned deep learning model extracts deep features from its last fully connected layer. This step reduces the training cost that deep learning models would incur if used for end-to-end classification. The third step selects the most important and influential features with a feature selection algorithm. These features are then fed to a one-class classifier for final detection. The one-class classifier constructs an enclosed boundary around the features of benign data; anything outside the boundary is declared anomalous, that is, malicious. This design enhances the framework's ability to detect evolving, unseen, polymorphic, and zero-day attacks and reduces the problem of overfitting.
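The first and last steps of the pipeline described above can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract does not say how bytes are mapped to colours or which one-class classifier is used, so the byte-triple-to-RGB mapping, the centroid-and-radius boundary (a simple stand-in for a real one-class classifier), and the names `bytes_to_rgb_image` and `CentroidBoundary` are all assumptions. The deep feature extraction and feature selection steps are omitted.

```python
import math


def bytes_to_rgb_image(data: bytes, width: int = 64):
    """Map raw file bytes to rows of (R, G, B) pixels.

    Assumption: three consecutive bytes form one pixel and the tail is
    zero-padded. The paper's actual colour mapping may differ.
    """
    pixels_needed = math.ceil(len(data) / 3)
    height = max(1, math.ceil(pixels_needed / width))
    padded = data.ljust(height * width * 3, b"\x00")
    image = []
    for row in range(height):
        start = row * width * 3
        image.append(
            [tuple(padded[start + 3 * col:start + 3 * col + 3])
             for col in range(width)]
        )
    return image


class CentroidBoundary:
    """Toy one-class detector: a hypersphere around the benign centroid.

    Hypothetical stand-in for the one-class classifier in the abstract.
    It is fitted on benign feature vectors only.
    """

    def fit(self, benign, quantile=0.95):
        dims = len(benign[0])
        self.centre = [sum(v[d] for v in benign) / len(benign)
                       for d in range(dims)]
        distances = sorted(math.dist(v, self.centre) for v in benign)
        # The radius is the distance at the chosen quantile of benign samples.
        index = min(len(distances) - 1, int(quantile * len(distances)))
        self.radius = distances[index]
        return self

    def is_malicious(self, vector):
        # Anything outside the enclosed benign boundary is flagged.
        return math.dist(vector, self.centre) > self.radius


# Toy feature vectors, not data from the paper.
detector = CentroidBoundary().fit([[0, 0], [1, 0], [0, 1], [1, 1]])
print(detector.is_malicious([0.5, 0.5]), detector.is_malicious([5, 5]))
```

Because the boundary is learned from benign samples alone, a previously unseen malware family needs no training examples of its own to be flagged, which is the property the abstract relies on for zero-day detection.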
The detection effectiveness of the proposed framework is validated against state-of-the-art deep learning models and conventional approaches. The proposed framework outperformed them, with an accuracy of 99.30% on the Malimg dataset. The Wilcoxon signed-rank test is used to validate the statistical significance of the results. The results indicate that the proposed framework is effective and can be used in the defence industry, resulting in more powerful and robust solutions against zero-day and polymorphic attacks.
• A novel approach of combining deep learning and machine learning is proposed. First, deep learning is used to extract deep features. The most influential features are selected in the subsequent steps to train the machine learning classifier for final detection. The proposed framework eliminates the need for human effort in reverse engineering tasks.
• The proposed framework consists of four steps. In the first step, all PEs are transformed into coloured images. The second step uses a deep learning model to extract the deep features. The third step selects the most important features. Finally, the lightweight and most influential features are sent to the machine learning classifier for final malware detection.
• We demonstrate that the proposed framework is lightweight, resilient, efficient and cost-effective. An in-depth analysis is performed to validate the detection effectiveness and generalisation of the proposed framework on multiple datasets. Our results demonstrate that the proposed framework outperformed conventional and state-of-the-art malware detection approaches.
[ABSTRACT FROM AUTHOR]
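For readers unfamiliar with the Wilcoxon signed-rank test named in the abstract, the sketch below computes its rank sums for paired scores from two detectors. It is a plain-Python illustration, not the authors' analysis code: the function name `wilcoxon_signed_rank` is hypothetical, zero differences are dropped (the classic treatment), and no p-value is computed. The sample scores are made up for the example and are not results from the paper.

```python
def wilcoxon_signed_rank(a, b):
    """Return (W+, W-): rank sums of positive and negative paired differences."""
    # Drop pairs with no difference, as in the classic form of the test.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        # Tied values share the average of the ranks they span (1-based).
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return w_plus, w_minus


# Made-up paired scores for two detectors (illustrative only).
detector_a = [10, 12, 9, 15]
detector_b = [8, 13, 9, 11]
print(wilcoxon_signed_rank(detector_a, detector_b))  # (5.0, 1.0)
```

The test statistic is usually taken as the smaller of the two rank sums. A value that is small relative to the number of non-zero pairs suggests that one detector is consistently better rather than better by chance.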
- Published
- 2024