
Is Error-Based Pruning Redeemable?

Authors :
Kevin W. Bowyer
Lawrence O. Hall
Steven A. Eschrich
Robert E. Banfield
Richard Collins
Source :
International Journal on Artificial Intelligence Tools. 12:249-264
Publication Year :
2003
Publisher :
World Scientific Pub Co Pte Ltd, 2003.

Abstract

Error-based pruning can be used to prune a decision tree without requiring validation data. It is implemented in the widely used C4.5 decision tree software and uses a parameter, the certainty factor, that affects the size of the pruned tree. Several researchers have compared error-based pruning with other approaches and have reported results suggesting that it produces larger trees with no increase in accuracy. They further suggest that as more data are added to the training set, the tree size after error-based pruning continues to grow even though accuracy does not improve. It appears that these results were obtained with the default certainty factor value. Here, we show that varying the certainty factor allows significantly smaller trees to be obtained with minimal or no accuracy loss. Also, the growth of tree size with added data can be halted with an appropriate choice of certainty factor. Methods of determining the certainty factor are discussed for both small and large data sets. Experimental results support the conclusion that error-based pruning can be used to produce appropriately sized trees with good accuracy when compared with reduced error pruning.
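
The abstract does not spell out how the certainty factor enters the pruning decision, so the following is only a minimal sketch of the general idea, not the paper's or C4.5's actual code. It assumes the usual description of error-based pruning: each leaf's training error is replaced by a pessimistic upper confidence limit on the binomial error rate, and a subtree is collapsed when a single leaf would have no more predicted errors than the subtree. The exact binomial limit is computed by bisection here, whereas C4.5 uses its own approximation; the function names `upper_error_bound`, `predicted_errors`, and `prune_subtree` are hypothetical.

```python
from math import comb


def binom_cdf(errors, n, p):
    """P(X <= errors) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(errors + 1))


def upper_error_bound(errors, n, cf=0.25):
    """Pessimistic error rate: the p at which observing at most `errors`
    mistakes in `n` cases has probability `cf` (found by bisection).
    Assumption: exact binomial limit, not C4.5's internal approximation."""
    if errors >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        # The CDF decreases as p grows, so move toward the crossing point.
        if binom_cdf(errors, n, mid) > cf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def predicted_errors(n, errors, cf):
    """Number of cases times the pessimistic error rate."""
    return n * upper_error_bound(errors, n, cf)


def prune_subtree(leaves, collapsed, cf=0.25):
    """Return True if replacing the subtree by one leaf is predicted to be
    no worse. `leaves` is a list of (n_cases, n_errors) per subtree leaf;
    `collapsed` is (n_cases, n_errors) for the single replacement leaf."""
    subtree_estimate = sum(predicted_errors(n, e, cf) for n, e in leaves)
    leaf_estimate = predicted_errors(collapsed[0], collapsed[1], cf)
    return leaf_estimate <= subtree_estimate


# Illustrative subtree: three error-free leaves covering 16 cases, which
# would make one error on the training data if collapsed to a single leaf.
leaves = [(6, 0), (9, 0), (1, 0)]
collapsed = (16, 1)
for cf in (0.25, 0.9):
    print(cf, prune_subtree(leaves, collapsed, cf))
```

In this sketch a smaller certainty factor gives a more pessimistic bound for small leaves and so favors collapsing, which is the lever the abstract refers to when it says the certainty factor affects the size of the pruned tree.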

Details

ISSN :
1793-6349 and 0218-2130
Volume :
12
Database :
OpenAIRE
Journal :
International Journal on Artificial Intelligence Tools
Accession number :
edsair.doi...........904cd8f8753950e71731a4b2f4d2bea2
Full Text :
https://doi.org/10.1142/s0218213003001228