1. Using discriminative feature in software entities for relevance identification of code changes.
- Author
- Huang, Yuan; Chen, Xiangping; Liu, Zhiyong; Luo, Xiaonan; Zheng, Zibin
- Subjects
- COMPUTER software development, SOFTWARE engineering, MACHINE learning, COMPUTER programming, BINARY codes
- Abstract
- Developers often bundle unrelated changes (e.g., a bug fix and a feature addition) in a single commit and then submit a 'poorly cohesive' commit to the version control system. Such a commit consists of multiple independent code changes and makes review of code changes harder. If the code changes can be identified as related or unrelated before the commit, the 'cohesiveness' of a commit can be guaranteed. Inspired by the effectiveness of machine learning techniques in classification, we model the relevance identification of code changes as a binary classification problem (i.e., related and unrelated changes) and propose discriminative features in software entities to characterize the relevance of code changes. In particular, to quantify the discriminative features, 21 coupling rules and 4 cochanged type relationships are extracted from software entities to construct the related changes vector (RCV). The 21 coupling rules, at the granularities of class, attribute, and method, capture the relevance of code changes along the structural coupling dimension, and the 4 cochanged type relationships capture the change type combinations of software entities that may cause related changes. Based on the RCV, machine learning algorithms are applied to identify the relevance of code changes. The experimental results show that the probabilistic neural network and the general regression neural network provide statistically significant improvements in the accuracy of relevance identification over the other 4 machine learning algorithms. The related changes vector with 72 dimensions (RCV_72) outperforms the other 2 RCVs with fewer dimensions. [ABSTRACT FROM AUTHOR]
- Published
- 2017
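
The abstract frames relevance identification as binary classification over a feature vector and names the probabilistic neural network as one of the better-performing classifiers. As a rough illustration only, the sketch below implements a generic Gaussian-kernel probabilistic neural network in plain Python. The function name `pnn_classify`, the smoothing parameter `sigma`, and the 4-dimensional toy vectors are all assumptions made for this example; they are not the authors' implementation, coupling rules, or data.

```python
import math

def pnn_classify(train_x, train_y, x, sigma=0.5):
    """Return the label whose averaged Gaussian kernel response at x is highest.

    Generic probabilistic neural network (Parzen-window) sketch;
    sigma is a hypothetical smoothing parameter, not a value from the paper.
    """
    scores = {}
    counts = {}
    for vec, label in zip(train_x, train_y):
        # Pattern layer: Gaussian kernel on the squared Euclidean distance.
        sq_dist = sum((a - b) ** 2 for a, b in zip(vec, x))
        kernel = math.exp(-sq_dist / (2 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    # Summation and decision layers: average per class, then pick the maximum.
    return max(scores, key=lambda label: scores[label] / counts[label])

# Toy binary vectors standing in for a related changes vector (hypothetical data).
train_x = [(1, 1, 0, 1), (1, 0, 1, 1), (0, 0, 0, 1), (0, 1, 0, 0)]
train_y = ["related", "related", "unrelated", "unrelated"]

prediction = pnn_classify(train_x, train_y, (1, 1, 1, 1))
```

In this toy setup, a query vector is assigned to whichever class has training vectors closest to it on average; a real RCV would have many more dimensions, each derived from a coupling rule or cochanged type relationship.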