Descriptor: "Software Metrics" / Publisher: springer nature - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Software Metrics"' showing total 139 results

1. Improving accuracy of code smells detection using machine learning with data balancing techniques.

Author: Khleel, Nasraldeen Alnor Adam and Nehéz, Károly
Subjects: *RECEIVER operating characteristic curves, *SOFTWARE failures, *COMPUTER software quality control, *SOFTWARE measurement, *DEEP learning
Abstract: Code smells indicate potential symptoms or problems in software due to inefficient design or incomplete implementation. These problems can affect software quality in the long term. Code smell detection is fundamental to improving software quality and maintainability, reducing software failure risk, and helping to refactor the code. Previous works have applied several prediction methods for code smell detection. However, many of them show that machine learning (ML) and deep learning (DL) techniques are not always suitable for code smell detection due to the problem of imbalanced data. So, data imbalance is the main challenge for ML and DL techniques in detecting code smells. To overcome this challenge, this study presents a method for detecting code smells based on DL algorithms (Bidirectional Long Short-Term Memory (Bi-LSTM) and Gated Recurrent Unit (GRU)) combined with data balancing techniques (random oversampling and Tomek links) to mitigate the data imbalance issue. To establish the effectiveness of the proposed models, the experiments were conducted on four code smell datasets (God class, data class, feature envy, and long method) extracted from 74 open-source systems. We compare and evaluate the performance of the models according to seven different performance measures: accuracy, precision, recall, f-measure, Matthews correlation coefficient (MCC), the area under a receiver operating characteristic curve (AUC), the area under the precision–recall curve (AUCPR) and mean square error (MSE).
After comparing the results obtained by the proposed models on the original and balanced datasets, we found that the best accuracy of 98% was obtained for the long method by both models (Bi-LSTM and GRU) on the original datasets; the best accuracy of 100% was obtained for the long method by both models (Bi-LSTM and GRU) on the balanced datasets (using random oversampling); and the best accuracy of 99% was obtained for the long method by the Bi-LSTM model, and for the data class and feature envy by the GRU model, on the balanced datasets (using Tomek links). The results indicate that the use of data balancing techniques had a positive effect on the predictive accuracy of the models presented. The results show that the proposed models can detect the code smells more accurately and effectively. [ABSTRACT FROM AUTHOR]
Published: 2024
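As a rough illustration of the random oversampling step named in this abstract, the sketch below duplicates randomly chosen minority-class rows until every class is as large as the biggest one. It is not the authors' implementation; the function name `random_oversample`, the toy metric rows, and the seed are all assumptions made for the example.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes match the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows of this class to draw duplicates from
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy data: six clean classes (label 0) and two smelly ones (label 1)
rows = [[10, 1], [12, 2], [9, 1], [11, 1], [13, 2], [8, 1], [90, 14], [75, 11]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.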

2. Software dependability analysis under neutrosophic environment using optimized Elman recurrent neural network-based classification algorithm and Mahalanobis distance-based ranking algorithm.

Author: Chatterjee, Subhashis and Saha, Deepjyoti
Subjects: *RECURRENT neural networks, *COMPUTER software developers, *SOFTWARE measurement, *CLASSIFICATION algorithms, *SYSTEMS software
Abstract: Dependability of software systems is one of the challenging issues for software developers. Main software dependability issues include reliability, security, performability, availability, maintainability, and aging. Software becomes non-dependable due to the overconfidence of developers, lack of knowledge about dependability issues, or ignorance of dependability attributes during software development. Classification and ranking of these non-dependable software modules based on above-mentioned dependability attribute values in early phase are the main aspects of this article. Hence, computation of dependability attribute values becomes a primitive concern here. The values of software dependability attributes depend on various software metrics like: requirement stability, cyclomatic complexity, essential complexity, lines of code, and so on. Neutrosophic inference system (NIS) has been used here to compute the values of dependability attributes accurately, reducing incompleteness, indeterminacy, and impreciseness from metric values by incorporating expert knowledge. An Elman Recurrent Neural Network (ERNN)-based algorithm has been proposed here based on predicted dependability attribute values to classify dependable and non-dependable software modules. Backpropagation algorithm and Genetic Algorithm are used during training of ERNN. Mahalanobis distance (MD) is used to rank software modules based on dependability attributes at early phase of development. This entire process of dependability analysis will help to optimize resource utilization, development cost, and meet the target release time. Different comparison criteria are used to compare the effectiveness of the proposed model with some existing models based on four datasets. Performance analysis demonstrates effectiveness and usefulness for identifying and ranking the non-dependable software modules during early phase of development. [ABSTRACT FROM AUTHOR]
Published: 2024

3. A systematic literature review of solutions for cold start problem.

Author: Singh, Neetu and Singh, Sandeep Kumar
Abstract: Insufficient knowledge about a new bug or a new developer, in the context of recommendations made in software bug repository (SBR) mining, degrades recommender-system performance and gives rise to the cold start problem (CSP). Many recent cold start solutions based on machine learning (ML) in general, and on reinforcement learning (RL) and deep learning in particular, have been published, but the insights from these works are not presented comprehensively and remain scattered; as a result, it is difficult for budding researchers to identify further enhancements. Also, there is a lack of a survey covering both ML- and RL-based solutions for CSP under one roof. To bridge these gaps, this article presents a critical review using the PRISMA model. Both ML- and RL-based solutions for the cold start problem are presented through a well-defined taxonomy along with a detailed bibliometric analysis. This article covers 78 significant primary studies published from 2012 to 2022. Findings from this review indicate that different solution strategies based on MABs as well as CMABs need to be designed for handling cold start settings in the bug and developer context. Moreover, there is great scope for performance improvement in the state-of-the-art solutions through improved accuracy, feature engineering integration, exploration of different process metrics, or hyper-parameter tuning. This review will give directions to novice researchers, academicians, and practitioners to work ahead on the issues identified in this contemporary challenging problem. [ABSTRACT FROM AUTHOR]
Published: 2024

4. Software defect prediction using a bidirectional LSTM network combined with oversampling techniques.

Author: Khleel, Nasraldeen Alnor Adam and Nehéz, Károly
Subjects: *SOFTWARE measurement, *RECURRENT neural networks, *COMPUTER software quality control, *SYSTEM failures, *COMPUTER software testing, *COMPUTER software
Abstract: Software defects are a critical issue in software development that can lead to system failures and cause significant financial losses. Predicting software defects is a vital aspect of ensuring software quality. This can significantly impact both saving time and reducing the overall cost of software testing. During the software defect prediction (SDP) process, automated tools attempt to predict defects in the source code based on software metrics. Several SDP models have been proposed to identify and prevent defects before they occur. In recent years, recurrent neural network (RNN) techniques have gained attention for their ability to handle sequential data and learn complex patterns. Still, these techniques are not always suitable for predicting software defects due to the problem of imbalanced data. To deal with this problem, this study aims to combine a bidirectional long short-term memory (Bi-LSTM) network with oversampling techniques. To establish the effectiveness and efficiency of the proposed model, the experiments have been conducted on benchmark datasets obtained from the PROMISE repository. The experimental results have been compared and evaluated in terms of accuracy, precision, recall, f-measure, Matthews correlation coefficient (MCC), the area under the ROC curve (AUC), the area under the precision-recall curve (AUCPR) and mean square error (MSE). The average accuracy of the proposed model on the original and balanced datasets (using random oversampling and SMOTE) was 88%, 94%, and 92%, respectively. The results showed that the proposed Bi-LSTM on the balanced datasets (using random oversampling and SMOTE) improves the average accuracy by 6 and 4 percentage points, respectively, compared to the original datasets. The average F-measure of the proposed model on the original and balanced datasets (using random oversampling and SMOTE) was 51%, 94%, and 92%, respectively.
The results showed that the proposed Bi-LSTM on the balanced datasets (using random oversampling and SMOTE) improves the average F-measure by 43 and 41 percentage points, respectively, compared to the original datasets. The experimental results demonstrated that combining the Bi-LSTM network with oversampling techniques positively affects defect prediction performance in datasets with imbalanced class distributions. [ABSTRACT FROM AUTHOR]
Published: 2024

5. A dissection of agile software development in changing scenario and the sustainable path ahead.

Author: Chakravarty, Krishna and Singh, Jagannath
Abstract: In recent years, unprecedented changes in business processes have been observed due to the pandemic. The world has experienced the power of information technology to sail through these testing times. A huge amount of software has been created and used during the pandemic. Agile is the most popular software development methodology of the current decade, and its key values are frequent interactions, development of working software, customer collaboration, and response to change. These values depend heavily on interactions among key stakeholders such as developers, testers, scrum masters, and customers. Due to the pandemic, the software development process itself has undergone several key changes, e.g., shifting the workspace to home, restricted travel, and challenges in collaborative work on a global platform. The research work takes a deep dive into the impact of COVID-19, especially on the health of the software and on work environment factors, including challenges and survival strategies. The health of software is investigated using software metrics such as productivity, customer satisfaction, and defect counts. The work environment aspects include employee motivation, work-life balance, ease of global team management, etc. A survey was conducted to collect feedback from more than 4000 IT professionals working on 177 different projects. Finally, the results are analyzed both quantitatively and qualitatively, and a sustainable future path is suggested with a mixed mode of work. To verify the strength and internal consistency of the survey, the Cronbach's alpha coefficient is computed. The suggested future mode of work is also validated by evidence of the implementation of revised work policies in different organizations across the globe. [ABSTRACT FROM AUTHOR]
Published: 2024
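For readers unfamiliar with the internal-consistency check mentioned in this abstract, Cronbach's alpha for k survey items is commonly computed as k/(k-1) * (1 - sum of item variances / variance of respondents' total scores). The sketch below applies that standard formula to made-up Likert responses; the data and the function name `cronbach_alpha` are assumptions and have nothing to do with the survey in the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one list of k item scores per respondent (k >= 2, 2+ respondents)."""
    k = len(responses[0])
    items = list(zip(*responses))                      # one column per survey item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in responses])  # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point Likert answers from five respondents to three items
survey = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(survey), 3))
```

Values close to 1 indicate that the items move together; the function assumes the total scores are not all identical, since that would make the denominator zero.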

6. Understanding the effect of batch refactoring on software quality.

Author: Agnihotri, Mansi and Chug, Anuradha
Abstract: Developers aim to create software with the least possible flaws and good quality. Hence, they apply a sequence of program transformation operations known as refactoring to create such software. Refactoring is a widely known code restructuring technique that enhances the program's structure without modifying its functionality. Internal properties of a system are generally measured using software metrics that help to understand both qualitative and quantitative aspects of the software. The authors explored the interrelationship between batch refactorings and software metrics in this study. Batch refactorings are a set of multiple inter-related refactoring operations that a developer applies to achieve the desired quality of the system. The refactorings have been extracted from different versions of three open-source datasets, Thumbnailator, Mp3agic, and Tabula-java. The extracted refactorings are studied under five categories of refactoring batches, i.e., class, method, field, parameter, and attribute. In addition, seven prominent software metrics have been used to investigate the impact of batch refactorings on critical software metrics and software quality. A software metric is considered critical if it does not lie within the set threshold value of the respective metric. The findings of the study show that 43% of the batch refactorings belong to method-level batches. Approximately 74.3% of the batch refactorings have been applied to the classes that comprise at least one critical software metric. Also, on an average method level batches improved the metrics by 27.7% and thus outperformed other batches in improving the overall quality of the software. The results of this study help in better understanding the relationship between batch refactoring and software quality, and thus the developers can make an appropriate decision in choosing the best refactoring batches for achieving the maximum software quality. [ABSTRACT FROM AUTHOR]
Published: 2024

7. Machine Learning-Based Exploration of the Impact of Move Method Refactoring on Object-Oriented Software Quality Attributes.

Author: Al Dallal, Jehad, Abdulsalam, Hanady, AlMarzouq, Mohammad, and Selamat, Ali
Subjects: *SOFTWARE refactoring, *COMPUTER software quality control, *STATISTICAL learning, *MACHINE learning, *SOURCE code
Abstract: Refactoring is a maintenance task that aims at enhancing the quality of a software's source code by restructuring it without affecting the external behavior. Move method refactoring (MMR) involves reallocating a method by moving it from one class to the class in which the method is used most. Several studies have been performed to explore the impact of MMR on several quality attributes. However, these studies have several limitations related to the applied approaches, considered quality attributes, and size of the selected datasets. This paper reports an empirical study that applies statistical and machine learning (ML) approaches to explore the impact of MMR on code quality. The study overcame the limitations of the existing studies, and this improvement is expected to make the results of this study more reliable and trustworthy. We considered eight quality attributes and thirty quality measures, and a total of approximately 4 K classes from seven Java open-source systems were involved in the study. The results provide evidence that most of the quality attributes were significantly improved by MMR in most cases. In addition, the results show that a limited number of measures, when considered individually, have a significant ability to predict MMR, whereas most of the considered measures, when considered together, significantly contribute to the MMR prediction model. The constructed ML-based prediction model has an area under curve (AUC) value of 96.6%. [ABSTRACT FROM AUTHOR]
Published: 2024

8. A generalized approach to construct node probability table for Bayesian belief network using fuzzy logic.

Author: Kumar, Chandan, Jha, Sudhanshu Kumar, Yadav, Dilip Kumar, Prakash, Shiv, and Prasad, Mukesh
Subjects: *BAYESIAN analysis, *FUZZY logic, *PROBABILITY theory, *COMPUTER software development, *SOFTWARE architecture
Abstract: The cause–effect relationship plays a major role in interpreting engineering and scientific problems, which largely involve identifying the potential causes of a problem. Bayesian belief networks (BBN), also referred to as Bayesian causal probabilistic networks, are widely used to model probabilistic events and to reason about problems involving uncertainty. A major challenge in BBN is to construct a node probability table (NPT), which grows exponentially with the number of variables. Various approaches exist for NPT construction, including expert elicitation, data analysis, survey and weighted functions, noisy-OR, noisy-MAX, recursive noisy-OR (ROR), extended recursive noisy-OR, and ranked nodes. However, these methods are problem-specific and do not offer a generalized approach applicable to all problem types. To address this issue, this paper proposes a generalized approach for constructing the NPT using fuzzy logic. The suggested strategy has been validated by applying it to a BBN prototype for software design and development. The proposed strategy has been evaluated with best-case and worst-case software metrics. [ABSTRACT FROM AUTHOR]
Published: 2024
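To make the NPT size problem concrete, the sketch below builds a table for a binary child node with the noisy-OR rule, one of the existing approaches listed in this abstract (not the fuzzy-logic method the paper proposes). With n binary parents the table has 2^n rows, yet noisy-OR needs only n link probabilities. The function name `noisy_or_npt`, the optional `leak` parameter, and the example probabilities are assumptions for illustration.

```python
from itertools import product

def noisy_or_npt(link_probs, leak=0.0):
    """Return {parent_states: P(child = True)} for binary parents under noisy-OR."""
    table = {}
    for states in product([False, True], repeat=len(link_probs)):
        # Probability that no active cause (and no leak) triggers the child
        p_off = 1.0 - leak
        for active, p in zip(states, link_probs):
            if active:
                p_off *= 1.0 - p
        table[states] = 1.0 - p_off
    return table

# Hypothetical causes of a defect: high complexity (0.8), poor review coverage (0.5)
npt = noisy_or_npt([0.8, 0.5])
for states, prob in npt.items():
    print(states, round(prob, 3))
```

Adding a third parent doubles the table to eight rows while adding only one more parameter, which is the trade-off such parametric constructions aim for.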

9. Quantifying charismatic quality parameters of MAMQ model using fuzzy logic for web development.

Author: Kumar, Nimish
Abstract: Software measures play an eminent role in evaluating the quality of a web-based application. As people increasingly rely on web-based applications, the importance of examining web quality parameters has escalated. Customer acceptability is low when a low-quality web application fails to meet the implicit and explicit expectations of customers. Many frameworks for measuring quality criteria of web-based applications have been proposed so far, but they all fall short in some way. The majority of existing frameworks are either limited to a specific web-based application perspective or deal with a limited set of quality attributes. Based on the constraints of current frameworks, this paper offers the multi-attribute quality model (MAQM), a generic conceptual object-oriented framework for quantifying non-functional aspects of web development. Using fuzzy logic, this paper quantifies the charismatic quality parameters, a non-functional parameter and a sub-characteristic under the web visitor's perspective of the MAQM model. The approach is chosen due to the heterogeneous and unpredictable nature of web applications. Charismatic non-functional parameters are an important feature in customer entrenchment towards any web-based application. [ABSTRACT FROM AUTHOR]
Published: 2023

10. Identification and analysis of change ripples in object-oriented software applications.

Author: Singh, R K and Agrawal, Anushree
Abstract: Software development and maintenance accompany several challenges related to change management. Identifying dependencies of change-prone classes helps to manage the after-effects of changes smoothly. This paper aims to study the ripple effect identification in object-oriented software applications using software metrics and change history. The changeability pattern is generated and compared with actual changes to validate the effectiveness of the proposed approach for ripple effect identification. The impact set of existing classes is derived using the change history with a commit weight-based approach. Two coupling measures, Likelihood of Change (LiCh) and Co-change Probability (CChPr), are derived to analyse the change impact set of existing classes. The change impact of new classes is derived using a Bagging classification technique. The source code metrics are independent variables and co-change derived from change history is the dependent variable for the prediction model. The results indicate that most dependent classes are identified using the proposed technique and advocate using software metrics and change history for ripple effect identification. It can be beneficial for software practitioners to understand the impact of change and identify dependencies of an explicit class. [ABSTRACT FROM AUTHOR]
Published: 2023
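The abstract does not define its LiCh and CChPr measures, so the sketch below shows only the general idea of mining co-change from history: it estimates P(b changes | a changes) as the share of commits touching a that also touch b. This is a plain frequency estimate, without the commit weighting the paper mentions; the function name `co_change_probability` and the class names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def co_change_probability(commits):
    """commits: iterable of sets of class names changed in the same commit.
    Returns {(a, b): estimated P(b changes | a changes)}."""
    changed = Counter()    # commits touching each class
    together = Counter()   # commits touching each ordered pair
    for files in commits:
        files = set(files)
        changed.update(files)
        together.update(permutations(sorted(files), 2))
    return {(a, b): n / changed[a] for (a, b), n in together.items()}

# Hypothetical change history
history = [
    {"Order", "Invoice"},
    {"Order", "Invoice"},
    {"Order"},
    {"Invoice", "Tax"},
]
probs = co_change_probability(history)
print(probs[("Order", "Invoice")])
print(probs[("Tax", "Invoice")])
```

Pairs that never changed together are simply absent from the result, so a lookup should treat a missing key as probability zero.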

11. A novel approach for software defect prediction using CNN and GRU based on SMOTE Tomek method.

Author: Khleel, Nasraldeen Alnor Adam and Nehéz, Károly
Subjects: CONVOLUTIONAL neural networks, MACHINE learning, COMPUTER software quality control, DEEP learning, COMPUTER software
Abstract: Software defect prediction (SDP) plays a vital role in enhancing the quality of software projects and reducing maintenance-based risks through the ability to detect defective software components. SDP refers to using historical defect data to construct a relationship between software metrics and defects via diverse methodologies. Several prediction models, such as machine learning (ML) and deep learning (DL), have been developed and adopted to recognize software module defects, and many methodologies and frameworks have been presented. Class imbalance is one of the most challenging problems these models face in binary classification. When the distribution of classes is imbalanced, the accuracy may be high, but the models cannot recognize data instances in the minority class, leading to weak classifications. So far, few previous studies have addressed the problem of class imbalance in SDP. In this study, a data sampling method is introduced to address the class imbalance problem and improve the performance of ML models in SDP. The proposed approach is based on a convolutional neural network (CNN) and gated recurrent unit (GRU) combined with a synthetic minority oversampling technique plus the Tomek link (SMOTE Tomek) to predict software defects. To establish the efficiency of the proposed models, the experiments have been conducted on benchmark datasets obtained from the PROMISE repository. The experimental results have been compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC), the area under the ROC curve (AUC), the area under the precision-recall curve (AUCPR), and mean square error (MSE). The experimental results showed that the proposed models predict the software defects more effectively on the balanced datasets than the original datasets, with an improvement of up to 19% for the CNN model and 24% for the GRU model in terms of AUC.
We compared our proposed approach with existing SDP approaches based on several standard performance measures. The comparison results demonstrated that the proposed approach significantly outperforms existing state-of-the-art SDP approaches on most datasets. [ABSTRACT FROM AUTHOR]
Published: 2023
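The abstract's point that accuracy can look high while the minority class goes undetected is easy to reproduce from a confusion matrix. The sketch below computes accuracy, precision, recall, F-measure and MCC with their standard definitions; the counts are invented for illustration and the function name `binary_metrics` is an assumption.

```python
from math import sqrt

def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

# Hypothetical project: 100 modules, 10 defective.
# A model that labels every module "clean" still reaches 90% accuracy.
always_clean = binary_metrics(tp=0, fp=0, fn=10, tn=90)
# A model that finds 8 of the 10 defects with 2 false alarms.
useful = binary_metrics(tp=8, fp=2, fn=2, tn=88)
print(always_clean)
print(useful)
```

The first model scores zero on recall, F-measure and MCC despite its accuracy, which is why imbalance-aware measures are reported alongside accuracy in defect prediction studies.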

12. Computational intelligence in software defects rules discovery.

Author: Vescan, Andreea, Şerban, Camelia, and Crişan, Gloria Cerasela
Subjects: *DISCOVERY (Law), *ANT colonies, *SYSTEMS software, *COMPUTER software, *SOFTWARE measurement, *COMPUTATIONAL intelligence
Abstract: Nowadays, due to the constant increase in size and complexity of the software systems imposed by their evolution, developing qualitative software systems becomes a highly important task. To achieve this goal, early detection of software defects is a must. The paper proposes an approach to generate rules for software defect prediction. In this respect, a Software Defects Rules Discovery (SDRD) algorithm was put forward. This one uses the ant colony system method to discover the best solution based on code metrics values. We conducted 20 experiments in total (five experiments with three metrics and 15 experiments with combinations of two metrics). The results revealed that the metrics that correlate with the dependent variable are CBO (Coupling Between Objects), RFC (Response For a Class) and NPM (Number of Private Methods), and that from all the combinations of two metrics, for the five projects, the best obtained rule is formed with RFC and NPM metrics. [ABSTRACT FROM AUTHOR]
Published: 2022

13. Cross project defect prediction: a comprehensive survey with its SWOT analysis.

Author: Khatri, Yogita and Singh, Sandeep Kumar
Abstract: Software fault prediction (SFP) refers to the process of identifying (or predicting) faulty modules based on their characteristics/software metrics. SFP can be done either using the same project data in both the training and testing phases, i.e., within-project defect prediction, or using a different one, as done in cross-project defect prediction (CPDP). Previous works show that contemporary research in this field is progressing towards CPDP. To present the current state of progress and the future prospects of CPDP, this article presents a comprehensive survey of CPDP considering the latest work along with its SWOT analysis. This survey aims to present novice researchers, academicians, and practitioners with the alphas and omegas of this contemporary challenging field. We have also carried out a qualitative and quantitative evaluation of CPDP with respect to some targeted research questions. A total of 34 significant primary CPDP studies published from 2008 to 2019 were selected. Both qualitative and quantitative data are extracted from each study. The collected data is then consolidated and analyzed to present a comprehensive report showing the current state of the art, along with the answers to the targeted research questions and finally the CPDP SWOT analysis. We observed that there is considerable scope for performance improvement in CPDP. Integration of feature engineering, exploration of different process metrics, hyperparameter tuning, and class imbalance handling in the CPDP setting are some of the ways identified for enhancing CPDP performance. Apart from this, we would like to conclude that there is a strong need to investigate Precision over the Recall and model's validity in terms of effort/cost-effectiveness. [ABSTRACT FROM AUTHOR]
Published: 2022

14. EkmEx - an extended framework for labeling an unlabeled fault dataset.

Author: Rizwan, Muhammad, Nadeem, Aamer, Sarwar, Sohail, Iqbal, Muddesar, Safyan, Muhammad, and Qayyum, Zia Ul
Subjects: COMPUTER software quality control, SOFTWARE measurement
Abstract: Software fault prediction (SFP) is a quality assurance process that identifies whether certain modules are fault-prone (FP) or not-fault-prone (NFP). Hence, it minimizes the testing effort incurred in terms of cost and time. Supervised machine learning techniques have the capacity to spot FP modules. However, such techniques require fault information from previous versions of the software product. Such information, accumulated over the life-cycle of the software, may be neither readily available nor reliable. Currently, clustering with experts' opinions is a prudent choice for labeling modules without any fault information. However, this technique may not fully address important aspects such as the selection of experts, conflicts in expert opinions, and the diverse expertise of domain experts. In this paper, we propose a comprehensive framework named EkmEx that extends the conventional fault prediction approaches while providing a mathematical foundation for aspects not addressed so far. EkmEx guides the selection of experts, furnishes an objective solution for resolving verdict conflicts, and manages the problem of diversity in the expertise of domain experts. We performed expert-assisted module labeling through EkmEx and conventional clustering on seven public NASA datasets. The empirical outcomes exhibit significant potential of the proposed framework in identifying FP modules across all seven datasets. [ABSTRACT FROM AUTHOR]
Published: 2022

15. Refactoring for reuse: an empirical study.

Author: Alomar, Eman Abdullah, Wang, Tianjia, Raut, Vaibhavi, Mkaouer, Mohamed Wiem, Newman, Christian, and Ouni, Ali
Abstract: Refactoring is the de facto practice to optimize software health. While several studies propose refactoring strategies to optimize software design through applying design patterns and removing design defects, little is known about how developers actually refactor their code to improve its reuse. Therefore, we extract, from 1,828 open source projects, a set of refactorings that were intended to improve software reusability. We analyze the impact of reusability refactorings on state-of-the-art reusability metrics, and we compare the distribution of reusability refactoring types with the distribution of the remaining mainstream refactorings. Overall, we found that the distribution of refactoring types applied in the context of reusability is different from the distribution of refactoring types in mainstream development. In the refactorings performed to improve reusability, source files are subject to more design-level types of refactorings. Reusability refactorings significantly impact high-level code elements, such as packages, classes, and methods, while typical refactorings impact all code elements, including identifiers and parameters. These findings provide practical insights into the current practice of refactoring in the context of code reuse. [ABSTRACT FROM AUTHOR]
Published: 2022

16. Copula-based software metrics aggregation.

Author: Ulan, Maria, Löwe, Welf, Ericsson, Morgan, and Wingkvist, Anna
Subjects: SOFTWARE measurement, ABSOLUTE value, SYSTEMS software, EMPIRICAL research, DECISION making, MAINTAINABILITY (Engineering)
Abstract: A quality model is a conceptual decomposition of an abstract notion of quality into relevant, possibly conflicting characteristics and further into measurable metrics. For quality assessment and decision making, metrics values are aggregated to characteristics and ultimately to quality scores. Aggregation has often been problematic as quality models do not provide the semantics of aggregation. This makes it hard to formally reason about metrics, characteristics, and quality. We argue that aggregation needs to be interpretable and mathematically well defined in order to assess, to compare, and to improve quality. To address this challenge, we propose a probabilistic approach to aggregation and define quality scores based on joint distributions of absolute metrics values. To evaluate the proposed approach and its implementation under realistic conditions, we conduct empirical studies on bug prediction of ca. 5000 software classes, maintainability of ca. 15000 open-source software systems, and on the information quality of ca. 100000 real-world technical documents. We found that our approach is feasible, accurate, and scalable in performance. [ABSTRACT FROM AUTHOR]
Published: 2021

17. An empirical study toward dealing with noise and class imbalance issues in software defect prediction.

Author: Pandey, Sushant Kumar and Tripathi, Anil Kumar
Subjects: *RECEIVER operating characteristic curves, *NOISE, *COMPUTER software
Abstract: The quality of defect datasets is a critical issue in the domain of software defect prediction (SDP). These datasets are obtained through the mining of software repositories. Recent studies question the quality of defect datasets because of inconsistencies between bug/clean fix keywords in fault reports and the corresponding links in the change management logs. The Class Imbalance (CI) problem is also a major challenge in SDP models. A defect prediction method trained on noisy and imbalanced data leads to inconsistent and unsatisfactory results. A combined analysis of noisy instances and the CI problem is therefore required. To the best of our knowledge, few studies have addressed these aspects. In this paper, we deal with the impact of noise and the CI problem on five baseline SDP models; we manually added various noise levels (0–80%) and identified their impact on the performance of those SDP models. Moreover, we provide guidelines for the possible range of tolerable noise for baseline models. We have also suggested the SDP model that has the highest noise tolerance and outperforms other classical methods. The True Positive Rate (TPR) and False Positive Rate (FPR) values of the baseline models reduce between 20–30% after adding 10–40% noisy instances. Similarly, the ROC (Receiver Operating Characteristics) values of SDP models reduce to 40–50%. The suggested model helps avoid noise between 40–60% as compared to other traditional models. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

18. Software Fault Prediction Using LSSVM with Different Kernel Functions.

Author: Kulamala, Vinod Kumar, Kumar, Lov, and Mohapatra, Durga Prasad
Subjects: *KERNEL functions, *COMPUTER software quality control, *RADIAL basis functions, *SUPPORT vector machines, *COMPUTER software development, *SOFTWARE measurement
Abstract: Software fault prediction is a process that helps to identify fault-prone modules in the early stages of software development. It also helps in improving the software quality with optimized effort and cost. Least Square Support Vector Machines (LSSVM) have been explored in problems related to classification. The aim of this paper is to develop and compare software fault prediction models using LSSVM with Linear, Polynomial and Radial Basis Function (RBF) kernels. The proposed models classify a software module as faulty or non-faulty by taking software metrics such as Halstead software metrics as input. Experiments on fifteen open source projects are performed to study the impact of the proposed models. The models are evaluated using Accuracy, F-measure and ROC AUC as the performance measures. The experimental results show that LSSVM with a polynomial kernel performs better than LSSVM with a linear kernel and similarly to LSSVM with an RBF kernel, and that the models developed using LSSVM improve the prediction accuracy of software fault prediction compared to the most frequently used models. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

19. Weighted software metrics aggregation and its application to defect prediction.

Author: Ulan, Maria, Löwe, Welf, Ericsson, Morgan, and Wingkvist, Anna
Abstract: It is a well-known practice in software engineering to aggregate software metrics to assess software artifacts for various purposes, such as their maintainability or their proneness to contain bugs. For different purposes, different metrics might be relevant. However, weighting these software metrics according to their contribution to the respective purpose is a challenging task. Manual approaches based on experts do not scale with the number of metrics. Also, experts get confused if the metrics are not independent, which is rarely the case. Automated approaches based on supervised learning require reliable and generalizable training data, a ground truth, which is rarely available. We propose an automated approach to weighted metrics aggregation that is based on unsupervised learning. It sets metrics scores and their weights based on probability theory and aggregates them. To evaluate the effectiveness, we conducted two empirical studies on defect prediction, one on ca. 200 000 code changes, and another on ca. 5 000 software classes. The results show that our approach can be used as an agnostic unsupervised predictor in the absence of a ground truth. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

20. Exploratory study of the impact of project domain and size category on the detection of the God class design smell.

Author: Alkharabsheh, Khalid, Crespo, Yania, Fernández-Delgado, Manuel, Viqueira, José R., and Taboada, José A.
Subjects: COMPUTER software quality control, MACHINE learning, GOD, SOFTWARE measurement, MAINTAINABILITY (Engineering)
Abstract: Design smell detection has proven to be an efficient strategy to improve software quality and consequently decrease maintainability expenses. This work explores the influence of the information about project context, expressed as project domain and size category information, on the automatic detection of the god class design smell by machine learning techniques. A set of experiments using eight classifiers to detect god classes was conducted on a dataset containing 12,587 classes from 24 Java projects. The results show that classifiers change their behavior when they are used on datasets that differ in these kinds of project information. The results show that god class design smell detection can be improved by feeding machine learning classifiers with this project context information. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

21. Threshold estimation from software metrics by using evolutionary techniques and its proposed algorithms, models.

Author: Padhy, Neelamadhab, Panigrahi, Rasmita, and Neeraja, K.
Abstract: Software metrics play an important role in the software industry. As the software industry grows in size and complexity, enhanced support is mandatory for computing and managing software quality. Quality measurement is one of the key concerns of the manager in the software industry, where thresholds play a crucial role. Software measurement is a necessary means of evaluating different quality attributes and characteristics, such as size, complexity, maintainability, and usability. Instead of that effective and efficient software system is straightforward dependent on the meaning of suitable thresholds. The objective of this paper is to estimate the threshold values from software metrics by using novel evolutionary intelligence techniques. The threshold and aging software design optimization algorithms and models to prevent software aging by using machine learning (evolutionary algorithms). Apart from the above-mentioned techniques, this paper also proposes a novel threshold estimation, aging, and survivability aware (sensitive) reusability optimization model of an object-oriented software system. To expand firmness, aging and survivability aware (sensitive) optimization threshold scheme aging prediction and software rejuvenation model and algorithms proposed. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

22. Extracting rules for vulnerabilities detection with static metrics using machine learning.

Author: Gupta, Aakanshi, Suri, Bharti, Kumar, Vijay, and Jain, Pragyashree
Abstract: Software quality is the prime concern in software engineering, and vulnerability is one of the major threats in this respect. Vulnerability hampers the security of the software and also impairs the quality of the software. In this paper, we have conducted experimental research on evaluating the utility of machine learning algorithms to detect vulnerabilities. To execute this experiment, a set of software metrics was extracted using machine learning in the form of easily accessible laws. Here, 32 supervised machine learning algorithms have been considered for the 3 most frequently occurring vulnerabilities in a software system, namely Lawofdemeter, BeanMemberShouldSerialize, and LocalVariablecouldBeFinal. Using the J48 machine learning algorithm in this research, an accuracy of up to 96% in vulnerability detection was achieved. The results are validated using tenfold cross-validation, and statistical parameters such as ROC curve, Kappa statistic, Recall, and Precision have been used for analyzing the result. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

23. A longitudinal study of static analysis warning evolution and the effects of PMD on software quality in Apache open source projects.

Author: Trautsch, Alexander, Herbold, Steffen, and Grabowski, Jens
Subjects: OPEN source software, COMPUTER software quality control, COMPUTER software development, EMPIRICAL research, SOFTWARE engineering
Abstract: Automated static analysis tools (ASATs) have become a major part of the software development workflow. Acting on the generated warnings, i.e., changing the code indicated in the warning, should be part of, at latest, the code review phase. Despite this being a best practice in software development, there is still a lack of empirical research regarding the usage of ASATs in the wild. In this work, we want to study ASAT warning trends in software via the example of PMD as an ASAT and its usage in open source projects. We analyzed the commit history of 54 projects (with 112,266 commits in total), taking into account 193 PMD rules and 61 PMD releases. We investigate trends of ASAT warnings over up to 17 years for the selected study subjects regarding changes of warning types, short and long term impact of ASAT use, and changes in warning severities. We found that large global changes in ASAT warnings are mostly due to coding style changes regarding braces and naming conventions. We also found that, surprisingly, the influence of the presence of PMD in the build process of the project on warning removal trends for the number of warnings per lines of code is small and not statistically significant. Regardless, if we consider defect density as a proxy for external quality, we see a positive effect if PMD is present in the build configuration of our study subjects. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

24. Code and commit metrics of developer productivity: a study on team leaders perceptions.

Author: Oliveira, Edson, Fernandes, Eduardo, Steinmacher, Igor, Cristo, Marco, Conte, Tayana, and Garcia, Alessandro
Subjects: COMPUTER software development, SOFTWARE productivity, SOFTWARE engineers, REVISION control (Computer science), DECISION making, INFORMATION retrieval
Abstract: Context: Developer productivity is essential to the success of software development organizations. Team leaders use developer productivity information for managing tasks in a software project. Developer productivity metrics can be computed from software repositories data to support leaders' decisions. We can classify these metrics in code-based metrics, which rely on the amount of produced code, and commit-based metrics, which rely on commit activity. Although metrics can assist a leader, organizations usually neglect their usage and end up sticking to the leaders' subjective perceptions only. Objective: We aim to understand whether productivity metrics can complement the leaders' perceptions. We also aim to capture leaders' impressions about relevance and adoption of productivity metrics in practice. Method: This paper presents a multi-case empirical study performed in two organizations active for more than 18 years. Eight leaders of nine projects have ranked the developers of their teams by productivity. We quantitatively assessed the correlation of leaders' rankings versus metric-based rankings. As a complement, we interviewed leaders for qualitatively understanding the leaders' impressions about relevance and adoption of productivity metrics given the computed correlations. Results: Our quantitative data suggest a greater correlation of the leaders' perceptions with code-based metrics when compared to commit-based metrics. Our qualitative data reveal that leaders have positive impressions of code-based metrics and potentially would adopt them. Conclusions: Data triangulation of productivity metrics and leaders' perceptions can strengthen the organization conviction about productive developers and can reveal productive developers not yet perceived by team leaders and probably underestimated in the organization. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

25. Software Quality Assurance in INDIGO-DataCloud Project: a Converging Evolution of Software Engineering Practices to Support European Research e-Infrastructures.

Author: Orviz Fernández, Pablo, David, Mário, Duma, Doina Cristina, Ronchieri, Elisabetta, Gomes, Jorge, and Salomoni, Davide
Abstract: From the advent of Grid technology – as the new paradigm of distributed computing – to the current days of Cloud computing models, the continuous need of new tools and services to match the scientific community requirements has been addressed in Europe through dedicated software development projects for e–Infrastructure creation, operation and management. This work presents the most significant software quality breakthroughs obtained in one of such projects, INDIGO–DataCloud, the main challenges and barriers confronted throughout the lifespan of the project, and how they were partially or totally overcome. The knowledge base established throughout the last 15 years of diverse software development initiatives in Europe for sustaining distributed research e-Infrastructures, supported by the advances in the area of software engineering, definitely contributed to improve the quality and reliability of the software delivered, and consequently, the operational stability of the European e–Infrastructures. INDIGO–DataCloud project is a good evidence of such insights, where, unlike the preceding trend found in past projects, the enforcement of Software Quality Assurance practices has been present since the very early stages of the software lifecycle. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

26. Software defect prediction using over-sampling and feature extraction based on Mahalanobis distance.

Author: NezhadShokouhi, Mohammad Mahdi, Majidi, Mohammad Ali, and Rasoolzadegan, Abbas
Subjects: *SOFTWARE measurement, *MACHINE performance, *COMPUTER software, *GAUSSIAN distribution, *FEATURE extraction, *MACHINE learning
Abstract: As the size of software projects becomes larger, software defect prediction (SDP) will play a key role in allocating testing resources reasonably, reducing testing costs, and speeding up the development process. Most SDP methods have used machine learning techniques based on common software metrics such as Halstead and McCabe's cyclomatic complexity. Datasets produced by these metrics usually do not follow a Gaussian distribution, and they also have overlaps between the defect and non-defect classes. In addition, in many software defect datasets, the number of defective modules (minority class) is considerably smaller than the number of non-defective modules (majority class). In this situation, the performance of machine learning methods is reduced dramatically. Therefore, we first need to create a balance between the minority and majority classes and then transfer the samples into a new space in which pairs of samples with the same class (must-link set) are as close to each other as possible and pairs of samples with different classes (cannot-link set) stay as far apart as possible. To achieve these objectives, in this paper, the Mahalanobis distance is used in two ways. First, the minority class is oversampled based on the Mahalanobis distance such that the generated synthetic data are more diverse from other minority data, and the minority class distribution is not changed significantly. Second, a feature extraction method based on Mahalanobis distance metric learning is used, which tries to minimize the distances of sample pairs in must-links and maximize the distances of sample pairs in cannot-links. To demonstrate the effectiveness of the proposed method, we performed experiments on 12 publicly available datasets collected from NASA repositories and compared its results with those of some powerful previous methods. The performance is evaluated in terms of F-measure, G-Mean, and Matthews correlation coefficient. Generally, the proposed method has better performance than the mentioned methods. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

27. Developing Sustainable and Energy-Efficient Software Systems

Author: Kruglov, Artem and Succi, Giancarlo
Subjects: Software Engineering, Software Metrics, Software Sustainability, Software Quality Assurance, Green Software, thema EDItEUR::U Computing and Information Technology::UM Computer programming / software engineering::UMZ Software Engineering, thema EDItEUR::K Economics, Finance, Business and Management::KJ Business and Management::KJQ Business mathematics and systems
Abstract: This open access book provides information on how to choose and collect the appropriate metrics for a software project in an organization. There are several kinds of metrics, based on the analysis of source code and developed for different programming paradigms such as structured programming and object-oriented programming (OOP). This way, the book follows three main objectives: (i) to identify existing and easily-collectible measures, if possible in the early phases of software development, for predicting and modeling both the traditional attributes of software systems and attributes specifically related to their efficient use of resources, and to create new metrics for such purposes; (ii) to describe ways to collect these measures during the entire lifecycle of a system, using minimally-invasive monitoring of design-time processes, and consolidate them into conceptual frameworks able to support model building by using a variety of approaches, including statistics, data mining and computational intelligence; and (iii) to present models and tools to support design time evolution of systems based on design-time measures and to empirically validate them. The book provides researchers and advanced professionals with methods for understanding the full implications of alternative choices and their relative attractiveness in terms of enhancing system resilience. It also explores the simultaneous use of multiple models that reflect different system interpretations or stakeholder perspectives.
Published: 2023
Full Text: View/download PDF

28. An Empirical Study on Using Class Stability as an Indicator of Class Similarity.

Author: Alshayeb, Mohammad
Subjects: *SOFTWARE measurement, *SOFTWARE maintenance, *COMPUTER software quality control, *MULTIPLE correspondence analysis (Statistics), *RESEMBLANCE (Philosophy)
Abstract: Software maintenance is an important software quality attribute. Many factors affect software maintenance, one of them being code cloning. Code clones are segments of code that are very similar. Software stability tends to measure the unchanged code elements. The objective of this paper is to find whether stability metrics can be used as an indicator of code structural similarity. I perform an empirical study to find the relationship between code similarity and stability at the class level. I also conduct clustering to classify stability and similarity metrics into different related groups. Finally, I perform principal component analysis to determine which class stability metrics have the strongest relationship with class similarity. In addition, I built a prediction model to predict class similarity using class stability metrics. The results show that the four investigated stability metrics have a significant relationship with similarity; however, the class stability metric (CSM) has the strongest correlation with code similarity. The clustering results also reveal that classes with high stability tend to have high similarity. In addition, I found that the CSM and class instability metric (CII) can both reveal 74.023% of class similarity. I conclude that stability metrics can be used as a good indicator of class similarity. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

29. Estimation of maintainability parameters for object-oriented software using hybrid neural network and class level metrics.

Author: Kumar, Lov, Lal, Sangeeta, and Murthy, Lalita Bhanu
Abstract: The various software metrics proposed in the literature can be used to evaluate the quality of software systems written in an object-oriented manner. These metrics are broadly categorized into two subcategories, i.e., system level software metrics and class level software metrics. In this work, ten different types of class level metrics are considered as input to develop a model for predicting the software maintainability of object-oriented software systems. These models are developed using three types of neural networks, i.e., artificial neural network, radial basis function network, and functional link artificial neural network. In this study, a hybrid algorithm based on genetic algorithm (GA) with gradient descent algorithm has been proposed to find optimal weights of these neural networks. Since accuracy of the prediction model is highly dependent on the class level metrics, they are considered as input of the models. So, five different feature selection techniques are used in this study to identify the best set of features with the objective of improving the accuracy of the software maintainability prediction model. The effectiveness of these models is evaluated using four evaluation metrics, i.e., MAE, MMRE, RMSE, and SEM. In this work, the parallel computing concept has also been considered with the objective of reducing the model training time. The results show that the model developed using the proposed hybrid algorithm based on GA with gradient descent algorithm gives better results as compared to the work presented by other authors in the literature. The results also show that feature selection techniques obtain better results for predicting maintainability as compared to using all metrics. The experimental results show that parallel computing is beneficial in reducing the model training time. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

30. To the attention of mobile software developers: guess what, test your app!

Author: Cruz, Luis, Abreu, Rui, and Lo, David
Subjects: COMPUTER software testing, COMPUTER software development, MOBILE apps, OPEN source software, COMPUTER software quality control
Abstract: Software testing is an important phase in the software development lifecycle because it helps in identifying bugs in a software system before it is shipped into the hand of its end users. There are numerous studies on how developers test general-purpose software applications. The idiosyncrasies of mobile software applications, however, set mobile apps apart from general-purpose systems (e.g., desktop, stand-alone applications, web services). This paper investigates working habits and challenges of mobile software developers with respect to testing. A key finding of our exhaustive study, using 1000 Android apps, demonstrates that mobile apps are still tested in a very ad hoc way, if tested at all. However, we show that, as in other types of software, testing increases the quality of apps (demonstrated in user ratings and number of code issues). Furthermore, we find evidence that tests are essential when it comes to engaging the community to contribute to mobile open source software. We discuss reasons and potential directions to address our findings. Yet another relevant finding of our study is that Continuous Integration and Continuous Deployment (CI/CD) pipelines are rare in the mobile apps world (only 26% of the apps are developed in projects employing CI/CD) – we argue that one of the main reasons is due to the lack of exhaustive and automatic testing. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

31. Deep neural network based hybrid approach for software defect prediction using software metrics.

Author: Manjula, C. and Florence, Lilly
Subjects: *SOFTWARE measurement, *DEEP learning, *COMPUTER software, *GENETIC algorithms, *DATA mining, *MATHEMATICAL optimization
Abstract: In the field of early prediction of software defects, various techniques have been developed such as data mining techniques, machine learning techniques. Still early prediction of defects is a challenging task which needs to be addressed and can be improved by getting higher classification rate of defect prediction. With the aim of addressing this issue, we introduce a hybrid approach by combining genetic algorithm (GA) for feature optimization with deep neural network (DNN) for classification. An improved version of GA is incorporated which includes a new technique for chromosome designing and fitness function computation. DNN technique is also improvised using adaptive auto-encoder which provides better representation of selected software features. The improved efficiency of the proposed hybrid approach due to deployment of optimization technique is demonstrated through case studies. An experimental study is carried out for software defect prediction by considering PROMISE dataset using MATLAB tool. In this study, we have used the proposed novel method for classification and defect prediction. Comparative study shows that the proposed approach of prediction of software defects performs better when compared with other techniques where 97.82% accuracy is obtained for KC1 dataset, 97.59% accuracy is obtained for CM1 dataset, 97.96% accuracy is obtained for PC3 dataset and 98.00% accuracy is obtained for PC4 dataset. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

32. Applying learning-based methods for recognizing design patterns.

Author: Dwivedi, Ashish Kumar, Tirkey, Anand, and Rath, Santanu Kumar
Abstract: Recognizing design patterns in source code helps in improving the aspect of reusability and maintainability that play an essential role during analysis and design phases of software development process. Software patterns provide design-level documents, which are applied for the recurring design issues. Analysis of design patterns is often carried out by using forward engineering as well as reverse engineering. In this study, a reverse engineering approach has been applied for recognizing design patterns. The study is comprised of two phases such as preparation of requisite dataset based on object-oriented software metrics and recognition of design patterns. The first phase, i.e., dataset preparation, is carried out by various object-oriented metrics. Design pattern recognition is performed by using learning-based algorithms such as artificial neural network and logistic regression. The presented method is validated by using three case studies such as JRefactory, JUnit and Quaqua. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

33. Measuring software stability based on complex networks in software.

Author: Pan, Weifeng and Chai, Chunlai
Subjects: *COMPUTER software quality control, *COMPUTER software, *SOFTWARE maintenance, *SOFTWARE measurement, *SCALABILITY
Abstract: Software maintenance is regarded as an activity of high cost. Developing meaningful metrics to assess the quality characteristics of software has become one of the most effective ways to reduce the cost. In this paper, we propose metrics to quantify the software stability from a complex network perspective. First, the topological structure of software at the class level is represented by a Class Coupling Network (CCN). Second, based on the CCN, we further propose a Node Influence Network (NIN) which considers both the direct and indirect (transitive) coupling strength between classes. Finally, based on NIN, we propose a metric to quantify the class stability and further propose a metric to quantify the stability of software as a whole. The proposed metrics are validated theoretically using the widely accepted Weyuker's criteria and empirically using Java programs. The theoretical evaluation shows the proposed metrics satisfy most of Weyuker's properties, and the empirical evaluation shows the effectiveness of our proposed metrics as indicators of external software qualities such as scalability and change proneness. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

34. A metrics suite for UML model stability.

Author: AbuHassan, Amjad and Alshayeb, Mohammad
Subjects: *SOFTWARE measurement, *UNIFIED modeling language, *COMPUTER software development, *COMPUTER architecture, *SOURCE code
Abstract: Software metrics have become an essential part of software development because of their importance in estimating cost, effort, and time during the development phase. Many metrics have been proposed to assess different software quality attributes, including stability. A number of software stability metrics have been proposed at the class, architecture, and system levels. However, these metrics typically target the source code. This paper proposes a software stability metrics suite at the model level for three UML diagrams: class, use case, and sequence. These three diagrams represent the most common diagrams in the three UML views: structural, functional, and behavioral. We introduce a client-master assessment approach to avoid measurement duplication. We also theoretically and empirically validate the proposed metrics suite. We also provide examples to demonstrate the use of the proposed metrics and their application as indicators of software stability. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

35. A study on software fault prediction techniques.

Author: Rathore, Santosh S. and Kumar, Sandeep
Subjects: DEBUGGING, MATHEMATICAL optimization, COMPUTER software testing, DIGITAL libraries, SOFTWARE measurement
Abstract: Software fault prediction aims to identify fault-prone software modules by using some underlying properties of the software project before the actual testing process begins. It helps in obtaining desired software quality with optimized cost and effort. Initially, this paper provides an overview of the software fault prediction process. Next, different dimensions of the software fault prediction process are explored and discussed. This review aims to help with the understanding of various elements associated with the fault prediction process and to explore various issues involved in software fault prediction. We search through various digital libraries and identify all the relevant papers published since 1993. The reviewed papers are grouped into three classes: software metrics, fault prediction techniques, and data quality issues. For each class, a taxonomical classification of different techniques and our observations are also presented. The review and summarization in tabular form are also given. At the end of the paper, the statistical analysis, observations, challenges, and future directions of software fault prediction are discussed. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

36. Prediction of software fault-prone classes using an unsupervised hybrid SOM algorithm.

Author: Viji, C., Rajkumar, N., and Duraisamy, S.
Subjects: *SELF-organizing maps, *COMPUTER software quality control, *SOFTWARE measurement, *SOFTWARE engineering, *SYSTEMS software, *COMPUTER software
Abstract: In software engineering, fault proneness prediction is one of the important fields for quality measurement using multiple code metrics. Metrics thresholds are very practical in measuring the code quality for fault proneness prediction. They help to improve the software quality in a short time at very low cost. Many researchers are racing to develop a measuring attribute for software quality using various methodologies. Currently, many fault proneness prediction models are available. Most of these methods identify faults either from historical data or by means of supervised algorithms. In most real cases, fault databases may not be available, so the process becomes tedious. This article proposes a hybrid model for identifying the faults in software models, together with a coupling model that accompanies the algorithm: the metrics are used to identify the faults, and the coupling model couples the metrics and the faults for the developed system software. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

37. Software defect prediction techniques using metrics based on neural network classifier.

Author: Jayanthi, R. and Florence, Lilly
Subjects: *SOFTWARE measurement, *COMPUTER software quality control, *COMPUTER software industry, *MACHINE learning, *COMPUTER software, *DATA reduction
Abstract: Software industries strive for software quality improvement by consistent bug prediction, bug removal and prediction of fault-prone modules. This area has attracted researchers due to its significant involvement in software industries. Various techniques have been presented for software defect prediction. Recent research has recommended data mining using machine learning as an important paradigm for software bug prediction. State-of-the-art software defect prediction tasks suffer from various issues such as classification accuracy. However, software defect datasets are imbalanced in nature and known to be fault prone due to their huge dimension. To address this issue, here we present a combined approach for software defect prediction and prediction of software bugs. The proposed approach delivers a concept of feature reduction and artificial intelligence, where feature reduction is carried out by the well-known principal component analysis (PCA) scheme, which is further improved by incorporating maximum-likelihood estimation for error reduction in PCA data reconstruction. Finally, a neural network based classification technique is applied, which shows prediction results. A framework is formulated and implemented on the NASA software dataset, where four datasets, i.e., KC1, PC3, PC4 and JM1, are considered for performance analysis using the MATLAB simulation tool. An extensive experimental study is performed where parameters such as confusion, precision, recall, and classification accuracy are computed and compared with existing software defect prediction techniques. The experimental study shows that the proposed approach can provide better performance for software defect prediction. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

38. Supporting the analyzability of architectural component models - empirical findings and tool support.

Author: Stevanetic, Srdjan and Zdun, Uwe
Abstract: This article discusses the understandability of component models that are frequently used as central views in architectural descriptions of software systems. We empirically examine how different component level metrics and the participants’ experience and expertise can be used to predict the understandability of those models. In addition, we develop a tool that supports applying the obtained empirical findings in practice. Our results show that the prediction models have a large effect size, which means that their prediction strength is of high practical significance. The participants’ experience plays an important role in the prediction but the obtained models are not as accurate as the models that use the component level metrics. The developed tools combine the DSL-based architecture abstraction approach with the obtained empirical findings. While the DSL-based architecture abstraction approach enables software architects to keep source code and architecture consistent, the metrics extensions enable them, while working with the DSL, to continuously judge and improve the analyzability of architectural component models based on the understandability of their individual components they create with the DSL. Provided metrics extensions can also help in assessing how much each architectural rule used to specify the DSL affects the understandability of a component which enables for instance finding the rules that contribute the most to a limited understandability. Finally, our approach supports change impact analysis, i.e., the identification of changes that affect different analyzability levels of the component models. We studied the applicability of our approach in a case study of an existing open source system. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

39. Developing Sustainable and Energy-Efficient Software Systems.

Author: Kruglov, Artem and Succi, Giancarlo
Subjects: Business mathematics & systems, Software Engineering, Green Software, Software Metrics, Software Quality Assurance, Software Sustainability
Abstract: Summary: This open access book provides information on how to choose and collect the appropriate metrics for a software project in an organization. There are several kinds of metrics, based on the analysis of source code and developed for different programming paradigms such as structured programming and object-oriented programming (OOP). This way, the book follows three main objectives: (i) to identify existing and easily-collectible measures, if possible in the early phases of software development, for predicting and modeling both the traditional attributes of software systems and attributes specifically related to their efficient use of resources, and to create new metrics for such purposes; (ii) to describe ways to collect these measures during the entire lifecycle of a system, using minimally-invasive monitoring of design-time processes, and consolidate them into conceptual frameworks able to support model building by using a variety of approaches, including statistics, data mining and computational intelligence; and (iii) to present models and tools to support design time evolution of systems based on design-time measures and to empirically validate them. The book provides researchers and advanced professionals with methods for understanding the full implications of alternative choices and their relative attractiveness in terms of enhancing system resilience. It also explores the simultaneous use of multiple models that reflect different system interpretations or stakeholder perspectives.

40. Evolutionary Computation-Based Techniques Over Multiple Data Sets: An Empirical Assessment.

Author: Khari, Manju and Kumar, Prabhat
Subjects: *EVOLUTIONARY computation, *DARWINIAN medicine, *COMPUTER software
Abstract: In the realm of software testing various organizations wish to predict the faults in their software systems prior to their deployment. This improves the delivered quality and also reduces the maintenance effort. A multitude of software metrics and statistical models have been developed to solve this problem and one such method is called defect prediction. Defect prediction is the process of identifying the defects in the software program prior to its deployment. In recent times, a class of learners called evolutionary computation (EC) techniques has emerged. These EC techniques apply the Darwinian principle of ‘survival of the fittest’. This study performs an empirical assessment of the performance of various EC techniques in the prediction of software defects over multiple data sets. An empirical assessment compares and assesses the performance capability of 16 EC techniques for evaluating the relationship between object-oriented metrics and defect prediction. The developed models are validated using 7 data sets obtained from open source software systems developed by the Software Foundation. On investigating their predictive capabilities and comparative performance, it was found that a majority of EC techniques proved to be highly effective. DTG (a hybridized algorithm) was observed to be the best performing technique. The work done in the current study shows that EC techniques are very effective and can be highly beneficial to testers in the realm of defect prediction in the future. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

41. A bayesian belief network based model for predicting software faults in early phase of software development process.

Author: Chatterjee, Subhashis and Maji, Bappa
Subjects: BAYESIAN analysis, STATISTICAL decision making, COMPUTER software development, PREDICTION models, PROBABILITY theory
Abstract: It is always better to have an idea about the future situation of a present work. Prediction of software faults in the early phase of software development life cycle can facilitate to the software personnel to achieve their desired software product. Early prediction is of great importance for optimizing the development cost of a software project. The present study proposes a methodology based on Bayesian belief network, developed to predict total number of faults and to reach a target value of total number of faults during early development phase of software lifecycle. The model has been carried out using the information from similar or earlier version software projects, domain expert’s opinion and the software metrics. Interval type-2 fuzzy logic has been applied for obtaining the conditional probability values in the node probability tables of the belief network. The output pattern corresponding to the total number of faults has been identified by artificial neural network using the input pattern from similar or earlier project data. The proposed Bayesian framework facilitates software personnel to gain the required information about software metrics at early phase for achieving targeted number of software faults. The proposed model has been applied on twenty six software project data. Results have been validated by different statistical comparison criterion. The performance of the proposed approach has been compared with some existing early fault prediction models. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

42. A model for estimating change propagation in software.

Author: M. Ferreira, Kecia A., S. Bigonha, Mariza A., S. Bigonha, Roberto, de Lima, Bernardo N., Gomes, Bárbara M., and O. Mendes, Luiz Felipe
Subjects: COMPUTER software development, STOCHASTIC models, COMPUTER systems, PREDICTION models, COMPUTER programming
Abstract: A major issue in software maintenance is change propagation. A software engineer should be able to assess the impact of a change in a software system, so that the effort to accomplish the maintenance may be properly estimated. We define a novel model, named K3B, for estimating change propagation impact. The model aims to predict how far a set of changes will propagate throughout the system. K3B is a stochastic model that has input parameters about the system and the number of modules which will be initially changed. K3B returns the estimated number of change steps, considering that a module may be changed more than once during a modification process. We provide the implementation of K3B for object-oriented programs. We compare our implementation with data from an artificial scenario, given by simulation, as well as with data from a real scenario, given by historical data. We found strong correlation between the results given by K3B and the results observed in the simulation, as well as with historical data of change propagation. K3B may be used for comparing software systems from the viewpoint of change impact. The model may aid software engineers in allocating proper resources to the maintenance tasks. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

43. Early software reliability analysis using reliability relevant software metrics.

Author: Yadav, Harikesh and Yadav, Dilip
Abstract: The early software reliability analysis is very useful for improving the quality of software at reduced testing effort. Software defect density indicator predicted in the early phases (requirement analysis, design and coding phases) provides an opportunity for the early identification of cost overrun, software development process issues and optimal development strategies. Failure data is not available in the early phases of the software development life cycle (SDLC). However, qualitative values of software metrics are available in the early phases of SDLC. Therefore, in this paper, a model is proposed to predict the software defect density indicator of early phases of SDLC using fuzzy logic and the reliability relevant software metrics of early artifacts. The proposed model is applied on twenty real software projects. It is observed that the requirement analysis phase defect density indicator value is relatively greater than that of the design and coding artifacts. The model is validated with the existing literature. Validation result is satisfactory. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

44. Software defects estimation using metrics of early phases of software development life cycle.

Author: Kumar, Chandan and Yadav, Dilip
Abstract: An estimation of software defects can be obtained in the later phase of software testing. However, with the aim of cost-effectiveness and timely management of resources, the software defects estimation in the early phases of software development life cycle (SDLC) is one of the major research areas. In this paper, a software defect estimation model is proposed using Bayesian belief network (BBN) and reliability relevant metrics of early phases of SDLC (e.g., requirement analysis, design and coding phases). The causal relationship of software metrics is modeled using BBN. The qualitative value of software metrics and expert assessment of software defects is used for developing the proposed model. The defects estimation accuracy of the proposed model is examined using qualitative data set of ten real software projects. The defects estimation results are compared with the existing model and found more accurate. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

45. Data Transformation in Cross-project Defect Prediction.

Author: Zhang, Feng, Keivanloo, Iman, and Zou, Ying
Abstract: Software metrics rarely follow a normal distribution. Therefore, software metrics are usually transformed prior to building a defect prediction model. To the best of our knowledge, the impact that the transformation has on cross-project defect prediction models has not been thoroughly explored. A cross-project model is built from one project and applied on another project. In this study, we investigate if cross-project defect prediction is affected by applying different transformations (i.e., log and rank transformations, as well as the Box-Cox transformation). The Box-Cox transformation subsumes log and other power transformations (e.g., square root), but has not been studied in the defect prediction literature. We propose an approach, namely Multiple Transformations (MT), to utilize multiple transformations for cross-project defect prediction. We further propose an enhanced approach MT+ to use the parameter of the Box-Cox transformation to determine the most appropriate training project for each target project. Our experiments are conducted upon three publicly available data sets (i.e., AEEEM, ReLink, and PROMISE). Comparing to the random forest model built solely using the log transformation, our MT+ approach improves the F-measure by 7, 59 and 43% for the three data sets, respectively. As a summary, our major contributions are three-fold: 1) conduct an empirical study on the impact that data transformation has on cross-project defect prediction models; 2) propose an approach to utilize the various information retained by applying different transformation methods; and 3) propose an unsupervised approach to select the most appropriate training project for each target project. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

46. A machine and deep learning analysis among SonarQube rules, product, and process metrics for fault prediction

Author: Lomio, F. (Francesco), Moreschini, S. (Sergio), and Lenarduzzi, V. (Valentina)
Abstract: Background: Developers spend more time fixing bugs and refactoring the code to increase the maintainability than developing new features. Researchers investigated the code quality impact on fault-proneness, focusing on code smells and code metrics. Objective: We aim at advancing fault-inducing commit prediction using different variables, such as SonarQube rules, product, process metrics, and adopting different techniques. Methods: We designed and conducted an empirical study among 29 Java projects analyzed with SonarQube and SZZ algorithm to identify fault-inducing and fault-fixing commits, computing different product and process metrics. Moreover, we investigated fault-proneness using different Machine and Deep Learning models. Results: We analyzed 58,125 commits containing 33,865 faults and infected by more than 174 SonarQube rules violated 1.8M times, on which 48 software product and process metrics were calculated. Results clearly identified a set of features that provided a highly accurate fault prediction (more than 95% AUC). Regarding the performance of the classifiers, Deep Learning provided a higher accuracy compared with Machine Learning models. Conclusions: Future works might investigate whether other static analysis tools, such as FindBugs or Checkstyle, can provide similar or different results. Moreover, researchers might consider the adoption of time series analysis and anomaly detection techniques.
Published: 2022

47. Analysis of high structural class coupling in object-oriented software systems.

Author: Savić, Miloš, Ivanović, Mirjana, and Radovanović, Miloš
Subjects: *NETWORK analysis (Planning), *SOFTWARE measurement, *COMPUTER software development, *POWER law (Mathematics), *SET theory, *WIRELESS sensor nodes
Abstract: Understanding coupling between classes in object-oriented (OO) software systems is useful for a variety of software development and maintenance activities. In this paper we propose a novel, network-based methodology to analyze high structural class coupling in OO software systems. The proposed methodology is based on statistically robust structural analysis of class collaboration networks whose nodes are enriched with both software metrics and domain-independent metrics used in analysis of complex networks. To demonstrate the usefulness of the methodology we analyze five open-source, large-scale software systems written in Java. Contrary to frequently reported findings, the obtained results indicate that high structural class coupling in real software systems cannot be accurately modeled by power-law distributions. Our analysis also shows that highly-coupled classes tend to be significantly more voluminous and functionally important compared to loosely coupled classes, and do not tend to be localized in class inheritance hierarchies. Finally, in four out of five analyzed systems highly coupled classes tend to have drastically higher afferent than efferent coupling. This implies that the existence of high class coupling in an OO software system would rather indicate negative aspects of extensive internal class reuse than negative aspects of extensive internal class aggregation. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

48. MAQM: a generic object-oriented framework to build quality models for Web-based applications.

Author: Kumar, Nimish, Dadhich, Reena, and Shastri, Aditya
Abstract: Software measures play an eminent role in the evaluation of quality of a Web-based application. Increased dependency on Web-based applications enhanced the need to evaluate Web quality parameters. A poor quality Web application fails to fulfill customer's implicit and explicit requirements resulting in poor customer acceptance. Many frameworks have so far been proposed for measurement of quality parameters of Web-based applications but they lack in one way or other. Most of the existing frameworks deal with a narrow range of quality characteristics or are limited to a specific Web-based application perspective. Based on the limitations of the current frameworks, this paper proposes a new generic conceptual object-oriented framework, a multi-attribute quality model, which scientifically categorizes quality characteristics and their sub-characteristics based on different perspectives and usage of Web applications. The main concept of the newly proposed framework is to analyze and expand the previously established quality characteristics in Web-based applications. The approach is chosen due to the diverse and impulsive nature of the Web applications. For designing the framework, the ISO/IEC 9126 framework is used as reference. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

49. The application of ROC analysis in threshold identification, data imbalance and metrics selection for software fault prediction.

Author: Shatnawi, Raed
Abstract: Software engineers have limited resources and need metrics analysis tools to investigate software quality such as fault-proneness of modules. There are a large number of software metrics available to investigate quality. However, not all metrics are strongly correlated with faults. In addition, software fault data are imbalanced and affect quality assessment tools such as fault prediction or threshold values that are used to identify risky modules. Software quality is investigated for three purposes. First, the receiver operating characteristics (ROC) analysis is used to identify threshold values to identify risky modules. Second, the ROC analysis is investigated for imbalanced data. Third, the ROC analysis is considered for feature selection. This work validated the use of ROC to identify thresholds for four metrics (WMC, CBO, RFC and LCOM). The ROC results after sampling the data are not significantly different from before sampling. The ROC analysis selects the same metrics (WMC, CBO and RFC) in most datasets, while other techniques have a large variation in selecting metrics. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

50. Measuring Object-Oriented Class Cohesion Based on Complex Networks.

Author: Gu, Aihua, Zhou, Xiaofeng, Li, Zonghua, Li, Qinfeng, and Li, Lu
Subjects: *COHESION, *MATERIAL plasticity
Abstract: Class cohesion has an immediate impact on maintainability, modifiability and understandability of the software. Here, a new metric of cohesion based on complex networks (CBCN) for measuring connectivity of class members was developed mainly relying on calculating class average clustering coefficient from graphs representing connectivity patterns of the various class members. In addition, the CBCN metric was assessed with theoretical validation according to four properties (nonnegativity and normalization, null and maximum values, monotonicity, cohesive modules) of the class cohesion theory. Based on data comparison with existing seventeen typical class cohesion metrics of class cohesion for a system, the CBCN metric was superior to others. Applying the CBCN metric to three open source software systems to calculate class average clustering coefficients, we found that understanding, modification and maintenance of classes in an open software system could be likely less difficult compared with those of others. Three open software systems have power-law distributions for the class average clustering coefficient, which makes possible the further understanding of the cohesion metric based on complex networks. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF
