1. ESMOTE: an overproduce-and-choose synthetic examples generation strategy based on evolutionary computation.
- Author
Zhang, Zhong-Liang; Peng, Rui-Rui; Ruan, Yuan-Peng; Wu, Jian; and Luo, Xing-Gang
- Subjects
LEARNING problems, DATA mining, MACHINE learning, OVERPRODUCTION, INTERPOLATION, EVOLUTIONARY computation, EVOLUTIONARY algorithms
- Abstract
Class imbalance learning is an important topic that has attracted considerable attention in machine learning and data mining. The most common method of addressing imbalanced datasets is the synthetic minority oversampling technique (SMOTE). However, SMOTE and its variants suffer from noise introduced by the interpolation of synthetic examples. In this paper, an overproduce-and-choose strategy, divided into an overproduction phase and a selection phase, is proposed to generate an appropriate set of synthetic examples for imbalance learning problems. In the overproduction phase, a new interpolation mechanism is developed to produce numerous synthetic examples; in the selection phase, the synthetic examples that are beneficial to the classification task are selected using instance selection based on evolutionary computation. Experiments are conducted on a large number of datasets drawn from real-world applications. The experimental results demonstrate that the proposed method is significantly better than SMOTE and its well-known variants in terms of several metrics, including G-mean (GM) and area under the curve (AUC). [ABSTRACT FROM AUTHOR]
- Published
- 2023
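For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of classic SMOTE-style interpolation and of the G-mean metric. It is not the paper's new interpolation mechanism or its evolutionary selection phase, neither of which is described in this record. The function names `smote_interpolate` and `g_mean`, the parameter `k`, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def smote_interpolate(X_min, n_new, k=3, seed=0):
    """Classic SMOTE-style interpolation (illustrative; not the ESMOTE mechanism).

    Each synthetic point lies on the segment between a minority example
    and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    # Pairwise Euclidean distances between minority examples.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # an example is not its own neighbour
    k = min(k, n - 1)
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbours
    base = rng.integers(0, n, size=n_new)    # which example to start from
    pick = rng.integers(0, k, size=n_new)    # which neighbour to move toward
    neigh = nn[base, pick]
    u = rng.random((n_new, 1))               # interpolation weight in [0, 1)
    return X_min[base] + u * (X_min[neigh] - X_min[base])

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    tpr = np.mean(y_pred[pos] == positive)
    tnr = np.mean(y_pred[~pos] != positive)
    return float(np.sqrt(tpr * tnr))

# Toy minority class: the four corners of the unit square (hypothetical data).
X_min = [[0, 0], [1, 0], [0, 1], [1, 1]]
synthetic = smote_interpolate(X_min, n_new=10)
score = g_mean([1, 1, 0, 0], [1, 0, 0, 0])
```

Because every synthetic point is a convex combination of two real minority examples, it can land in a region dominated by the majority class, which is the kind of interpolation noise the abstract refers to.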