Study of Various Decision Tree Pruning Methods with Their Empirical Comparison in WEKA

Executive Summary

Classification is an important problem in data mining. Given a data set, a classifier generates a meaningful description for each class. Decision trees are among the most effective and widely used classification methods, and several algorithms exist for inducing them. A tree is first induced, and a subsequent pruning phase then removes sub-trees to improve accuracy and prevent overfitting. This paper discusses various pruning methods and their features, and evaluates the effectiveness of pruning. Experiments on the diabetes and glass datasets measure accuracy and tree size under various pruning factors.
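The induce-then-prune idea in the summary can be illustrated with a minimal sketch of reduced-error pruning, one common post-pruning method. This is an assumption chosen for illustration: the summary does not name the pruning methods the paper compares, and the sketch does not use WEKA or the diabetes and glass datasets. The `Node` class, the hand-built tree, and the validation rows are all hypothetical. Working bottom-up, a sub-tree is replaced by a leaf whenever that does not increase the error on held-out validation rows.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Row = Tuple[List[float], str]  # (feature vector, class label)


@dataclass
class Node:
    """Hypothetical tree node. A leaf has feature=None.

    `label` is the majority class of the training rows that reached the node.
    An internal node sends x to `left` when x[feature] <= threshold.
    """
    label: str
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, x: List[float]) -> str:
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def size(node: Node) -> int:
    """Number of nodes in the tree, leaves included."""
    return 1 if node.is_leaf() else 1 + size(node.left) + size(node.right)


def errors(node: Node, rows: List[Row]) -> int:
    return sum(predict(node, x) != y for x, y in rows)


def prune(node: Node, rows: List[Row]) -> Node:
    """Reduced-error pruning; `rows` are the validation rows reaching `node`."""
    if node.is_leaf():
        return node
    left_rows = [r for r in rows if r[0][node.feature] <= node.threshold]
    right_rows = [r for r in rows if r[0][node.feature] > node.threshold]
    node.left = prune(node.left, left_rows)
    node.right = prune(node.right, right_rows)
    leaf = Node(label=node.label)
    # Keep the smaller tree whenever it is no worse on the validation rows.
    return leaf if errors(leaf, rows) <= errors(node, rows) else node


def build_overfit_tree() -> Node:
    """Hand-built tree with one spurious split (x0 <= 3) that fits noise."""
    noisy = Node("neg", 0, 3.0, left=Node("pos"), right=Node("neg"))
    low = Node("neg", 0, 2.0, left=Node("neg"), right=noisy)
    return Node("neg", 0, 5.0, left=low, right=Node("pos"))


# Hypothetical held-out rows: class is "pos" exactly when x0 > 5.
VALIDATION_ROWS: List[Row] = [
    ([1.0], "neg"), ([2.5], "neg"), ([3.5], "neg"), ([4.0], "neg"),
    ([6.0], "pos"), ([7.0], "pos"), ([8.0], "pos"),
]

if __name__ == "__main__":
    tree = build_overfit_tree()
    print("before:", size(tree), "nodes,", errors(tree, VALIDATION_ROWS), "errors")
    tree = prune(tree, VALIDATION_ROWS)
    print("after: ", size(tree), "nodes,", errors(tree, VALIDATION_ROWS), "errors")
```

On this toy input the tree shrinks from 7 nodes to 3 and the single validation error disappears, which mirrors the two quantities the summary reports: accuracy and tree size.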

  • Format: PDF
  • Size: 547.89 KB