Iterated feature selection algorithms with layered recurrent neural network for software fault prediction.

Authors :
Turabieh, Hamza
Mafarja, Majdi
Li, Xiaodong
Source :
Expert Systems with Applications. May 2019, Vol. 122, p27-42. 16p.
Publication Year :
2019

Abstract

Highlights
• Fault prediction improves the effectiveness of software quality assurance activities.
• This paper focuses on building an effective fault prediction classifier.
• Fault prediction model using iterated feature selection algorithms with L-RNN.
• We perform experiments on 19 open source projects.
• Fault prediction model is best suited for projects with faulty classes less than the threshold value.

Abstract
Software fault prediction (SFP) is typically used to predict faults in software components. Machine learning techniques (e.g., classification) are widely used to tackle this problem. With the availability of the huge amount of data that can be obtained from mining software historical repositories, it becomes possible to have some features (metrics) that are not correlated with the faults, which consequently mislead the learning algorithm and thus decrease its performance. One possible solution to eliminate those metrics is Feature Selection (FS). In this paper, a novel FS approach is proposed to enhance the performance of a layered recurrent neural network (L-RNN), which is used as a classification technique for the SFP problem. Three different wrapper FS algorithms (i.e., Binary Genetic Algorithm (BGA), Binary Particle Swarm Optimization (BPSO), and Binary Ant Colony Optimization (BACO)) were employed iteratively. To assess the performance of the proposed approach, 19 real-world software projects from the PROMISE repository are investigated and the experimental results are discussed. Receiver operating characteristic - area under the curve (ROC-AUC) is used as a performance measure. The results are compared with other state-of-the-art approaches including Naïve Bayes (NB), Artificial Neural Network (ANN), logistic regression (LR), the k-nearest neighbors (k-NN) and C4.5 decision trees, in terms of area under the curve (AUC). Our results have demonstrated that the proposed approach can outperform other existing methods. [ABSTRACT FROM AUTHOR]
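
Illustrative note (not from the paper): the abstract combines two ideas, a wrapper feature selector that scores binary feature masks and ROC-AUC as the score. The sketch below shows how those two pieces might fit together under heavy simplifying assumptions. The names `roc_auc`, `mask_fitness`, and `best_mask` are hypothetical; the "classifier" is a stand-in that sums the selected metrics (the paper uses an L-RNN); and an exhaustive search over masks replaces BGA, BPSO, and BACO, which is only feasible for a handful of features.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen faulty module (label 1)
    receives a higher score than a randomly chosen clean one (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mask_fitness(mask, rows, labels):
    """Wrapper-style fitness of a binary feature mask.
    ASSUMPTION: the model is a toy scorer that sums the selected metrics;
    a real wrapper would train and validate a classifier here."""
    if not any(mask):
        return 0.0  # an empty feature set cannot rank anything
    scores = [sum(v for v, keep in zip(row, mask) if keep) for row in rows]
    return roc_auc(labels, scores)


def best_mask(rows, labels):
    """Exhaustive search over all binary masks (stand-in for a metaheuristic).
    On ties, the first mask in enumeration order is returned."""
    n_features = len(rows[0])
    return max(
        product([0, 1], repeat=n_features),
        key=lambda m: mask_fitness(m, rows, labels),
    )


# Toy data: three metrics per module; only the first tracks the fault label.
rows = [(5, 1, 9), (4, 8, 2), (1, 9, 1), (0, 2, 8)]
labels = [1, 1, 0, 0]
selected = best_mask(rows, labels)
```

On this toy data the search keeps only the first metric, since it alone already separates the two faulty modules from the two clean ones. The paper's evaluation protocol, datasets, and search algorithms are not reproduced here.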

Details

Language :
English
ISSN :
09574174
Volume :
122
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
134381079
Full Text :
https://doi.org/10.1016/j.eswa.2018.12.033