An Efficient Decision Tree for Imbalance data learning using Confiscate and Substitute Technique

Authors :
E. Ilavarasan
Salina Adinarayana
Source :
Materials Today: Proceedings. 5:680-687
Publication Year :
2018
Publisher :
Elsevier BV, 2018.

Abstract

Data mining and knowledge discovery is the process of discovering knowledge from real-world datasets. One limitation of real-world datasets is that they are often contaminated with noisy and missing values, and this contamination degrades the performance of existing algorithms. In this paper, we propose a novel algorithm, Confiscate and Substitute Imbalance Data Learning (CSIDL), for better knowledge discovery from real-world datasets. The confiscate step is applied to the majority subset to remove noisy, borderline, and missing instances, and the substitute step fills in missing instances in the minority subset to strengthen the dataset. Experimental comparisons are carried out on six real-world datasets against traditional benchmark algorithms. The results suggest that the proposed CSIDL algorithm performs better than the compared algorithms in terms of accuracy, AUC, precision, and F-measure.
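The abstract does not say how noisy or borderline instances are detected, nor how missing values are substituted, so the following Python sketch is only one plausible reading of the confiscate and substitute idea, not the authors' CSIDL implementation. It assumes that missing values are encoded as `None`, that minority-class gaps are filled with the minority column mean, and that a majority instance counts as noisy or borderline when most of its `k` nearest neighbours belong to the minority class. The function name `confiscate_and_substitute` and its parameters are hypothetical.

```python
import math


def confiscate_and_substitute(rows, labels, majority_label, k=3):
    """Illustrative preprocessing sketch (assumed rules, not the paper's exact method).

    rows: list of numeric feature lists, with None marking a missing value.
    labels: class label per row; majority_label names the majority class.
    Returns (cleaned_rows, cleaned_labels).
    """
    n_features = len(rows[0])
    majority = [(r, y) for r, y in zip(rows, labels) if y == majority_label]
    minority = [(r, y) for r, y in zip(rows, labels) if y != majority_label]

    # Substitute (assumption): fill missing minority values with the
    # mean of the observed minority values in the same column.
    means = []
    for j in range(n_features):
        observed = [r[j] for r, _ in minority if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    minority = [
        ([means[j] if r[j] is None else r[j] for j in range(n_features)], y)
        for r, y in minority
    ]

    # Confiscate, part 1: drop majority rows that contain missing values.
    majority = [(r, y) for r, y in majority if all(v is not None for v in r)]

    # Confiscate, part 2 (assumption): drop majority rows whose k nearest
    # neighbours are mostly minority instances (treated as noisy/borderline).
    pool = [(r, False) for r, _ in majority] + [(r, True) for r, _ in minority]
    kept = []
    for r, y in majority:
        others = [p for p in pool if p[0] is not r]
        neighbours = sorted(others, key=lambda p: math.dist(r, p[0]))[:k]
        minority_votes = sum(1 for _, is_minority in neighbours if is_minority)
        if minority_votes <= len(neighbours) / 2:
            kept.append((r, y))

    cleaned = kept + minority
    return [r for r, _ in cleaned], [y for _, y in cleaned]
```

The cleaned rows and labels could then be passed to any decision tree learner. The choice of mean imputation and a nearest-neighbour vote is for illustration only; other noise filters or imputation schemes would fit the same two-step structure.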

Details

ISSN :
2214-7853
Volume :
5
Database :
OpenAIRE
Journal :
Materials Today: Proceedings
Accession number :
edsair.doi...........39e510b106b38924adf8798cc1cd8d23
Full Text :
https://doi.org/10.1016/j.matpr.2017.11.133