OBMI: oversampling borderline minority instances by a two-stage Tomek link-finding procedure for class imbalance problem

Authors :
Qiangkui Leng
Jiamei Guo
Jiaqing Tao
Xiangfu Meng
Changzhong Wang
Source :
Complex & Intelligent Systems, Vol 10, Iss 4, Pp 4775-4792 (2024)
Publication Year :
2024
Publisher :
Springer, 2024.

Abstract

Mitigating the impact of class-imbalanced datasets on classifiers remains a challenge for the machine learning community. Conventional classifiers perform poorly on such data because they are habitually biased toward the majority class. Among existing solutions, the synthetic minority oversampling technique (SMOTE) has shown great potential, aiming to improve the dataset rather than the classifier. However, SMOTE still needs improvement because it oversamples every minority instance equally. Based on the consensus that instances far from the borderline contribute less to classification, this paper proposes a refined method for oversampling borderline minority instances (OBMI) using a two-stage Tomek link-finding procedure. In the oversampling stage, pairs of between-class instances that are nearest to each other are first found to form Tomek links. Then, the minority instances in these Tomek links are extracted as base instances. Finally, new minority instances are generated, each linearly interpolated between a base instance and one of its minority neighbors. To address the overlap caused by oversampling, the cleaning stage employs Tomek links again to remove borderline instances from both classes. OBMI is compared with ten baseline methods on 17 benchmark datasets. The results show that it performs better on most of the selected datasets in terms of F1-score and G-mean. Statistical analysis also indicates that it ranks higher in the Friedman test.
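
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`tomek_links`, `obmi_sketch`), the parameters `k`, `n_new`, and `seed`, the uniform random choice of base instance and neighbor, and the default of generating enough instances to balance the two classes are all assumptions made for illustration. The sketch assumes a binary problem, Euclidean distance, and at least two instances.

```python
import math
import random


def nearest(i, X):
    """Index of the point closest to X[i], excluding i itself (Euclidean)."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def tomek_links(X, y):
    """Pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest(i, X) for i in range(len(X))]
    return [(i, nn[i]) for i in range(len(X))
            if i < nn[i] and nn[nn[i]] == i and y[i] != y[nn[i]]]


def obmi_sketch(X, y, minority=1, k=3, n_new=None, seed=0):
    """Oversample around Tomek-link minority instances, then clean with Tomek links."""
    rng = random.Random(seed)
    X = [tuple(p) for p in X]
    y = list(y)

    # Oversampling stage: the minority member of each Tomek link is a base instance.
    bases = [i if y[i] == minority else j for i, j in tomek_links(X, y)]
    minority_idx = [i for i in range(len(X)) if y[i] == minority]
    if n_new is None:
        # Assumption: generate enough points to balance the two classes.
        n_new = max(len(X) - 2 * len(minority_idx), 0)

    new_points = []
    if bases and len(minority_idx) > 1:
        for _ in range(n_new):
            b = rng.choice(bases)
            # The k nearest minority neighbors of the base instance.
            neighbors = sorted((m for m in minority_idx if m != b),
                               key=lambda m: math.dist(X[b], X[m]))[:k]
            n = rng.choice(neighbors)
            t = rng.random()  # interpolation factor in [0, 1)
            new_points.append(tuple(xb + t * (xn - xb)
                                    for xb, xn in zip(X[b], X[n])))

    X_aug = X + new_points
    y_aug = y + [minority] * len(new_points)

    # Cleaning stage: drop both members of every Tomek link in the augmented data.
    drop = {i for pair in tomek_links(X_aug, y_aug) for i in pair}
    keep = [i for i in range(len(X_aug)) if i not in drop]
    return [X_aug[i] for i in keep], [y_aug[i] for i in keep]
```

Calling `obmi_sketch(X, y)` with a list of feature tuples `X` and a list of 0/1 labels `y` returns the resampled features and labels. The brute-force nearest-neighbor search is quadratic in the number of instances, which keeps the sketch short but would need replacing with an indexed search on larger datasets.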

Details

Language :
English
ISSN :
2199-4536 and 2198-6053
Volume :
10
Issue :
4
Database :
Directory of Open Access Journals
Journal :
Complex & Intelligent Systems
Publication Type :
Academic Journal
Accession number :
edsdoj.fef090ef454c4ffe8b92ade1c3332771
Document Type :
article
Full Text :
https://doi.org/10.1007/s40747-024-01399-y