Adapted pruning scheme for the framework of imbalanced data-sets
- Source :
- KES
- Publication Year :
- 2017
- Publisher :
- Elsevier BV, 2017.
Abstract
- Learning from imbalanced data is attracting increasing interest from the machine learning community, mainly because of the large number of real-world applications affected by class imbalance. Adapting standard decision trees to imbalanced data is one of the many approaches that have been developed to address this problem. This adaptation has been proposed from three perspectives: the splitting criterion, the assignment rule, and pruning. In this paper, we focus on the pruning of decision trees. We propose an adaptation of the standard pruning algorithm MCCP to address the skewed-data problem. Our contribution operates at two levels: adapting the metric used to select the nodes to be pruned first, and changing the evaluation measure used to select the best decision tree on the pruning set. Our goal is to show that, contrary to the popular belief in the literature that decision tree pruning is useless, an adaptive pruning technique for imbalanced situations is more efficient and more accurate on the minority class. Twelve binary-class data-sets with different imbalance ratios are used to test the performance of the proposed method. Experimental results show that the proposed post-pruning approach can increase the performance of imbalanced decision trees in terms of recent evaluation measures that are appropriate for imbalanced classification.
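The two levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not state which adapted metric or evaluation measure the paper uses. The sketch assumes that MCCP refers to minimal cost-complexity pruning, shows the standard weakest-link metric of that algorithm, and uses the geometric mean of sensitivity and specificity (G-mean) as one possible imbalance-aware measure. All function names and the toy pruning set are hypothetical.

```python
from math import sqrt


def weakest_link_alpha(node_cost, subtree_cost, n_leaves):
    """Standard cost-complexity link strength of an internal node:
    (R(t) - R(T_t)) / (number of leaves of T_t - 1).
    The node with the smallest value is pruned first."""
    return (node_cost - subtree_cost) / (n_leaves - 1)


def accuracy(y_true, y_pred):
    """Fraction of correctly classified examples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed measure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(sensitivity * specificity)


def select_subtree(candidates, y_true, score):
    """Return the name of the candidate subtree whose predictions
    on the pruning set obtain the highest score."""
    return max(candidates, key=lambda name: score(y_true, candidates[name]))


# Hypothetical pruning set: 8 majority examples (0), 2 minority examples (1).
y_prune = [0] * 8 + [1] * 2

# Hypothetical predictions of two candidate subtrees on the pruning set.
candidates = {
    "root_only": [0] * 10,                         # predicts majority only
    "partly_pruned": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],  # finds both minority cases
}

best_by_accuracy = select_subtree(candidates, y_prune, accuracy)
best_by_g_mean = select_subtree(candidates, y_prune, g_mean)
```

On this toy pruning set, accuracy favors the tree that never predicts the minority class, while G-mean favors the tree that recovers both minority examples at the cost of some false positives. That difference is the kind of effect that motivates changing the evaluation measure used on the pruning set.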
- Subjects :
- business.industry
Computer science
Decision tree
Context (language use)
02 engineering and technology
computer.software_genre
Machine learning
Class (biology)
Set (abstract data type)
Principal variation search
020204 information systems
Metric (mathematics)
0202 electrical engineering, electronic engineering, information engineering
General Earth and Planetary Sciences
020201 artificial intelligence & image processing
Data mining
Artificial intelligence
Pruning (decision trees)
Adaptation (computer science)
business
computer
General Environmental Science
Details
- ISSN :
- 1877-0509
- Volume :
- 112
- Database :
- OpenAIRE
- Journal :
- Procedia Computer Science
- Accession number :
- edsair.doi...........1506eeb47a6836489a49deb90d79f2d7
- Full Text :
- https://doi.org/10.1016/j.procs.2017.08.060