A Comparative Study of Class Rebalancing Methods for Security Bug Report Classification
- Source :
- IEEE Transactions on Reliability. 70:1658-1670
- Publication Year :
- 2021
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2021.
Abstract
- Identifying security bug reports (SBRs) accurately from a bug repository can reduce a software product's security risk. However, SBR prediction suffers from the class imbalance problem because the number of SBRs is often limited, and this issue has not been thoroughly investigated in previous studies. In our study, we choose six real-world projects of different sizes, with over 120 000 bug reports in total, as our empirical subjects. We first analyze the impact of the class imbalance issue on SBR prediction and confirm its negative impact on prediction performance. We then perform a comparative study of six state-of-the-art class rebalancing methods combined with five popular classification algorithms for SBR prediction. Compared with the baseline method Farsec, the class rebalancing methods improve performance in 78% of cases even in the worst case. Moreover, the combination of Rose and the random forest classification algorithm constructs the model with the best performance, increasing the F1-score by 267% in the best case and by 75% on average. Finally, we summarize eight main findings based on the results of our empirical studies, which can provide guidelines for choosing appropriate class rebalancing methods and classifiers for SBR prediction in practice.
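As an illustration of the ideas named in the abstract, the following is a minimal sketch of class rebalancing and of how an F1-score gain can be expressed as a percentage. It is not the authors' pipeline: plain random oversampling is swapped in as a simpler stand-in for Rose, no classifier is trained, and every function name and all toy values here are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class.
    Assumption: a simple stand-in for rebalancing, not the Rose method."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_gain(baseline, improved):
    """Percentage improvement of `improved` over a nonzero `baseline`."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Toy data: 2 security bug reports (label 1) among 10 reports.
    reports = [f"report-{i}" for i in range(10)]
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    _, balanced = random_oversample(reports, labels)
    print(Counter(balanced))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
    print(relative_gain(0.2, 0.5))
```

Under this reading, a reported gain such as 267% is a relative increase over the baseline F1-score, not an absolute difference between the two scores.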
- Subjects :
- Operations Research
Security bug
Computer science
Machine learning
Class (biology)
Random forest
0803 Computer Software, 0906 Electrical and Electronic Engineering
Class imbalance
Statistical classification
Software
Empirical research
Artificial intelligence
Electrical and Electronic Engineering
Safety, Risk, Reliability and Quality
Baseline (configuration management)
Details
- ISSN :
- 1558-1721 and 0018-9529
- Volume :
- 70
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Reliability
- Accession number :
- edsair.doi.dedup.....887080e7531ada41bde0fa4f814fec9a
- Full Text :
- https://doi.org/10.1109/tr.2021.3118026