Synthetic oversampling method for boundary minority samples based on Tomek links.

Authors :
陶佳晴
贺作伟
冷强奎
翟军昌
孟祥福
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. Feb2023, Vol. 40 Issue 2, p463-469. 7p.
Publication Year :
2023

Abstract

In a class-imbalanced dataset, samples close to the class boundary are more likely to be misclassified, so accurately identifying boundary samples matters for classification. Existing methods usually identify boundary samples with K-nearest neighbors, but their accuracy leaves room for improvement. To address this problem, the paper proposes a synthetic oversampling method for boundary minority samples based on Tomek links. The method first finds pairs of samples from different classes that are nearest to each other, which form Tomek links. It then uses these links to identify the minority samples located at the inter-class boundary. Next, it applies the linear interpolation mechanism of the synthetic minority oversampling technique (SMOTE) between the boundary samples and their minority neighbors, thereby balancing the dataset. Comparison experiments against eight sampling algorithms show that the proposed method obtains higher G-mean and F1 values on most of the datasets. [ABSTRACT FROM AUTHOR]
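
The three steps in the abstract (form Tomek links, pick out the minority samples in those links, interpolate toward minority neighbors) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `tomek_link_oversample`, the parameters `k` and `seed`, the use of Euclidean distance, and the choice to generate exactly enough samples to equalize the two classes are all assumptions made for illustration, and the paper's actual neighbor selection and sampling amounts may differ.

```python
import numpy as np


def tomek_link_oversample(X, y, minority_label, k=5, seed=0):
    """Hypothetical sketch: SMOTE-style oversampling of the minority
    samples that take part in a Tomek link (binary problem assumed)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise Euclidean distances; a sample is never its own neighbor.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)

    # Tomek link: two samples with different labels that are each
    # other's nearest neighbor.
    idx = np.arange(len(X))
    in_link = (nearest[nearest] == idx) & (y != y[nearest])
    boundary = idx[in_link & (y == minority_label)]

    minority = idx[y == minority_label]
    # Assumption: generate enough samples to equalize the class counts.
    n_new = int((y != minority_label).sum() - len(minority))
    if len(boundary) == 0 or len(minority) < 2 or n_new <= 0:
        return X, y

    k_eff = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        b = rng.choice(boundary)
        # k nearest minority neighbors of the chosen boundary sample
        # (its own distance is inf, so it is never selected).
        neighbors = minority[np.argsort(dist[b, minority])[:k_eff]]
        n = rng.choice(neighbors)
        # SMOTE linear interpolation between the sample and its neighbor.
        synthetic.append(X[b] + rng.random() * (X[n] - X[b]))

    X_out = np.vstack([X, np.array(synthetic)])
    y_out = np.concatenate([y, np.full(n_new, minority_label, dtype=y.dtype)])
    return X_out, y_out
```

If no Tomek link contains a minority sample, the sketch returns the data unchanged; a real implementation would need a fallback, which the abstract does not describe.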

Details

Language :
Chinese
ISSN :
10013695
Volume :
40
Issue :
2
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
162018068
Full Text :
https://doi.org/10.19734/j.issn.1001-3695.2022.07.0341