Research on Malicious Code Analysis Method Based on Semi-supervised Learning
- Source :
- Communications in Computer and Information Science ISBN: 9789811070792
- Publication Year :
- 2017
- Publisher :
- Springer Singapore, 2017.
Abstract
- Research on malicious code classification helps researchers understand attack characteristics quickly and helps reduce losses to users and even to states. Currently, most malware classification methods are based on supervised learning algorithms, which perform poorly when only a small number of labeled samples is available. Therefore, in this paper, we propose a new malware classification method based on a semi-supervised learning algorithm. First, we extract the impactful static and dynamic features and serialize them to obtain high-dimensional features. Then, we select among them with an Ensemble Feature Grader composed of Information Gain, Random Forest and Logistic Regression with \(L_1\) and \(L_2\), and reduce the dimension again with PCA. Finally, we use the Learning with Local and Global Consistency (LLGC) algorithm combined with K-means to classify malware. Experimental comparison of SVM, LLGC and K-means + LLGC shows that, with this feature extraction, feature reduction and classification method, the K-means + LLGC algorithm is superior to LLGC in both classification accuracy and efficiency: accuracy increases by 2% to 3%, and it exceeds that of SVM when the number of labeled samples is small.
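For readers unfamiliar with the classifier named in the abstract, the following is a minimal sketch of plain LLGC in its standard closed form (Gaussian affinity matrix, symmetric normalization, then solving a linear system). It is not the authors' implementation: the function name `llgc`, the parameters `alpha` and `sigma`, and the convention of marking unlabeled samples with `-1` are assumptions made for illustration. The abstract does not say how K-means is combined with LLGC, so that step is left out.

```python
import numpy as np


def llgc(X, y, alpha=0.99, sigma=1.0):
    """Label propagation in the style of LLGC (illustrative sketch).

    X     : (n, d) feature matrix, e.g. after feature selection and PCA.
    y     : (n,) integer labels; -1 marks an unlabeled sample (assumed convention).
    alpha : weight of neighbor information versus initial labels, in (0, 1).
    sigma : bandwidth of the Gaussian affinity (assumed hyperparameter).
    """
    n = X.shape[0]
    classes = np.unique(y[y >= 0])

    # Affinity matrix W with a Gaussian kernel and a zero diagonal.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Initial label matrix Y: one-hot rows for labeled samples, zero rows otherwise.
    Y = np.zeros((n, classes.size))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0

    # Closed-form solution F = (I - alpha * S)^{-1} Y, then pick the best class per row.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return classes[np.argmax(F, axis=1)]
```

Calling `llgc(X, y)` with one labeled sample per class and the rest set to `-1` returns a predicted label for every row of `X`. The dense pairwise-distance computation is quadratic in the number of samples, so this form only suits small data sets.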
- Subjects :
- Theoretical computer science
Computer science
business.industry
Feature extraction
Static program analysis
Pattern recognition
Semi-supervised learning
computer.software_genre
Random forest
Support vector machine
ComputingMethodologies_PATTERNRECOGNITION
Dimension (vector space)
Feature (machine learning)
Malware
Artificial intelligence
business
computer
- ISBN :
- 978-981-10-7079-2
- ISBNs :
- 9789811070792
- Database :
- OpenAIRE
- Journal :
- Communications in Computer and Information Science ISBN: 9789811070792
- Accession number :
- edsair.doi...........5bfe7e385844b224b0842861df8d8308