A novel pruning approach using expert knowledge for data-specific pruning
- Source :
- Engineering with Computers. 28:21-30
- Publication Year :
- 2011
- Publisher :
- Springer Science and Business Media LLC, 2011.
- Abstract :
- Classification is an important data mining task that discovers hidden knowledge in labeled datasets. Most pruning approaches assume that all datasets are equally uniform and equally important, so they apply the same pruning to every dataset. In real-world classification problems, however, datasets are not all alike, and applying a single pruning rate tends to produce decision trees that are large and have a high misclassification rate. We approach the problem by first investigating the properties of each dataset and then using expert knowledge to derive a data-specific pruning value, which is used to design pruning techniques that prune decision trees close to perfection. We propose an efficient pruning algorithm, dubbed EKBP, which is very general because any learning algorithm can be used as the base classifier. We have implemented the proposed solution and experimentally verified its effectiveness on forty real-world benchmark datasets from the UCI Machine Learning Repository. In all of these experiments, the proposed approach dramatically reduces tree size while retaining or improving accuracy.
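The abstract does not describe how EKBP works, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea of a data-specific pruning value, using a simple bottom-up, cost-complexity-style rule. The `Node` class, the `penalty` parameter, the `build_tree` helper, and the error counts are all hypothetical and chosen for illustration; in practice the penalty would be set per dataset, for example by a domain expert.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary decision-tree node used only for this sketch."""
    errors_as_leaf: int              # training errors if this node were a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def subtree_errors(node: Node) -> int:
    """Total training errors made by the leaves under this node."""
    if node.is_leaf():
        return node.errors_as_leaf
    return subtree_errors(node.left) + subtree_errors(node.right)


def count_leaves(node: Node) -> int:
    """Number of leaves under this node (a simple measure of tree size)."""
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


def prune(node: Node, penalty: float) -> Node:
    """Bottom-up pruning with a per-dataset penalty (an assumed rule, not EKBP).

    A subtree is collapsed into a leaf when the errors it would make as a
    leaf do not exceed the subtree's errors plus `penalty` for each leaf
    that the collapse removes.
    """
    if node.is_leaf():
        return node
    node.left = prune(node.left, penalty)
    node.right = prune(node.right, penalty)
    extra_leaves = count_leaves(node) - 1
    if node.errors_as_leaf <= subtree_errors(node) + penalty * extra_leaves:
        node.left = None
        node.right = None
    return node


def build_tree() -> Node:
    """Small made-up tree with three leaves."""
    return Node(10, left=Node(4, Node(1), Node(2)), right=Node(3))


# A larger penalty (as might be chosen for a noisier dataset) prunes harder.
for penalty in (0.5, 2.0, 5.0):
    print(penalty, count_leaves(prune(build_tree(), penalty)))
```

The same tree shrinks from three leaves to one as the penalty grows, which is the sense in which a per-dataset value, rather than a single global rate, controls the trade-off between tree size and training error.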
- Subjects :
- business.industry
Computer science
General Engineering
Decision tree
computer.software_genre
Machine learning
Computer Science Applications
ComputingMethodologies_PATTERNRECOGNITION
Principal variation search
Modeling and Simulation
Null-move heuristic
Artificial intelligence
Data mining
Pruning algorithm
business
Classifier (UML)
computer
Computer Science::Databases
Software
Large size
Killer heuristic
Details
- ISSN :
- 1435-5663 and 0177-0667
- Volume :
- 28
- Database :
- OpenAIRE
- Journal :
- Engineering with Computers
- Accession number :
- edsair.doi...........4556e99305aff29eaebda42adf47ef45
- Full Text :
- https://doi.org/10.1007/s00366-011-0214-1