1. A generation mechanism for minority-class samples with hypercuboid constraints.
- Author
- 贺作伟, 陶佳晴, 冷强奎, 翟军昌, and 孟祥福
- Subjects
- *INTERPOLATION, *PROBLEM solving, *MINORITIES
- Abstract
The synthetic minority oversampling technique (SMOTE) is one of the effective methods for addressing the class-imbalance problem. However, SMOTE's linear interpolation restricts each synthesized sample to the line segment connecting two original samples, which limits the diversity of the new samples and may generate noisy samples when that segment passes through a majority-class region. To address these issues, this paper proposes a generation mechanism for minority samples with hypercuboid constraints. Instead of interpolating linearly, the mechanism constructs a hypercuboid as the generation region for new samples, thereby increasing the variability between the synthesized and original samples. It then checks whether the hypercuboid contains majority samples to decide whether the hypercuboid must be adjusted, which prevents new samples from falling into the majority-class region. The proposed mechanism is integrated into three oversampling methods, i.e., SMOTE, Borderline-SMOTE, and ADASYN, replacing their linear interpolation step, and the integrated methods are evaluated experimentally on 11 benchmark datasets from KEEL. The results show that, compared with the original methods, the integrated methods help the classifier obtain a higher F1 and a comparable G-mean. This indicates that the hypercuboid generation mechanism can significantly improve the classifier's ability to recognize minority samples while still taking the majority samples into account. [ABSTRACT FROM AUTHOR]
- Published
- 2022
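
The abstract describes replacing SMOTE's line-segment interpolation with sampling from a hypercuboid that is adjusted when it contains majority samples. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not say how the hypercuboid is built or adjusted, so the sketch assumes an axis-aligned box spanned by a minority sample and one of its minority neighbors, and assumes the adjustment halves the box toward the seed until no majority sample lies inside. The function name `hypercuboid_sample` and the parameter `max_shrinks` are hypothetical.

```python
import numpy as np


def hypercuboid_sample(seed, neighbor, majority, rng, max_shrinks=10):
    """Draw one synthetic point from a box spanned by `seed` and `neighbor`.

    Assumptions (not taken from the abstract): the box is axis-aligned, and
    it is shrunk toward `seed` by halving while it contains a majority sample.
    """
    seed = np.asarray(seed, dtype=float)
    corner = np.asarray(neighbor, dtype=float)
    majority = np.asarray(majority, dtype=float).reshape(-1, seed.size)

    for _ in range(max_shrinks + 1):
        low = np.minimum(seed, corner)
        high = np.maximum(seed, corner)
        # A majority sample is inside the box if every coordinate is in range.
        inside = np.all((majority >= low) & (majority <= high), axis=1)
        if not inside.any():
            # Uniform draw over the whole box, not just the connecting segment.
            return rng.uniform(low, high)
        # Assumed adjustment: move the far corner halfway toward the seed.
        corner = seed + 0.5 * (corner - seed)

    # The box could not be cleared; fall back to a copy of the seed.
    return seed.copy()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # The majority point (1.5, 1.5) lies in the initial box [0, 2] x [0, 2],
    # so the box is halved once to [0, 1] x [0, 1] before sampling.
    print(hypercuboid_sample([0.0, 0.0], [2.0, 2.0], [[1.5, 1.5]], rng))
```

Plugging such a routine into SMOTE, Borderline-SMOTE, or ADASYN would amount to calling it wherever those methods interpolate between a seed and a selected neighbor. The neighbor selection itself stays unchanged.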