A Fast Clustering Algorithm for Modularization of Large-Scale Software Systems
- Source :
- IEEE Transactions on Software Engineering. 48:1451-1462
- Publication Year :
- 2022
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2022.
Abstract
- A software system evolves over time to meet the needs of its users. Understanding a program is the most important step in applying new requirements. Clustering techniques make a program easier to understand by dividing it into small, meaningful parts. In general, clustering algorithms fall into two categories: hierarchical and non-hierarchical (such as search-based approaches). Although clustering problems generally tend to be NP-hard, search-based algorithms produce acceptable clusterings; however, they are constrained in time and space and are therefore inefficient for large-scale software systems. Most algorithms currently used in software clustering do not scale well to large and very large applications. In this paper, we present a new, fast clustering algorithm, named FCA, that overcomes the space and time constraints of existing algorithms by performing operations on the dependency matrix and extracting other matrices based on a set of features. Experimental results on ten small applications, ten folders with different functionalities from Mozilla Firefox, a large application (ITK), and a very large application (Chromium) demonstrate that the proposed algorithm achieves higher-quality modularization than hierarchical algorithms. It is also competitive with search-based algorithms and with a clustering algorithm based on subsystem patterns, while its running time is much shorter than that of both the hierarchical and the non-hierarchical algorithms. The source code of the proposed algorithm is available at https://github.com/SoftwareMaintenanceLab.
- Subjects :
- Source code
Computer science
Software engineering
Scale (descriptive set theory)
Engineering and technology
Design structure matrix
Set (abstract data type)
Software
Modular programming
Electrical engineering, electronic engineering, information engineering
Software system
Data mining
Cluster analysis
Details
- ISSN :
- 2326-3881 and 0098-5589
- Volume :
- 48
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Software Engineering
- Accession number :
- edsair.doi...........e8d961b06dabc79b9c233dd8ddf9482d