Provided by: Science and Development Network (SciDev.Net)
Topic: Big Data
In this paper, a new splitting criterion for building decision trees is proposed. A splitting criterion specifies the best splitting variable and its threshold for further splitting at a node of the tree. Drawing on the classical forward selection method and its enhanced versions, the variable with the largest absolute correlation with the target value is chosen as the best splitting variable at each node. Then, the idea of maximizing the margin between classes, as in support vector machines (SVM), is used to find the best threshold on the selected variable to classify the data. This procedure is applied recursively at each node until the leaf nodes are reached.
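The procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract gives no formulas, so the sketch assumes Pearson correlation for variable selection and a binary target. It also stands in for the SVM step with a simple one-dimensional rule, the midpoint of the widest gap between adjacent sorted values whose labels differ. That rule matches the hard-margin solution only when the classes are separable on the chosen variable. The names `best_split`, `build_tree`, `predict_one` and the `max_depth` parameter are hypothetical.

```python
import numpy as np

def best_split(X, y):
    """Return (feature index, threshold) for one node.

    Assumption: Pearson correlation is used for selection; the paper's
    exact correlation measure is not given in the abstract.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns (zero variance) get correlation 0 instead of NaN.
    corr = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    j = int(np.argmax(np.abs(corr)))

    # Stand-in for the SVM margin step (an assumption, not the paper's method):
    # midpoint of the widest gap between neighbours with different labels.
    order = np.argsort(X[:, j])
    xs, ys = X[order, j], y[order]
    gaps = np.where(ys[:-1] != ys[1:], xs[1:] - xs[:-1], -np.inf)
    k = int(np.argmax(gaps))
    return j, (xs[k] + xs[k + 1]) / 2

def build_tree(X, y, depth=0, max_depth=5):
    """Apply best_split recursively until nodes are pure or max_depth is hit."""
    labels, counts = np.unique(y, return_counts=True)
    majority = labels[np.argmax(counts)]
    if len(labels) == 1 or depth >= max_depth:
        return {"leaf": majority}
    j, t = best_split(X, y)
    left = X[:, j] <= t
    # If the threshold fails to separate any samples, stop with a leaf.
    if left.all() or not left.any():
        return {"leaf": majority}
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree(X[left], y[left], depth + 1, max_depth),
        "right": build_tree(X[~left], y[~left], depth + 1, max_depth),
    }

def predict_one(tree, x):
    """Walk from the root to a leaf and return its label."""
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

# Toy data: the first variable separates the classes with a wide margin.
X = np.array([[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]])
y = np.array([0, 0, 1, 1])
tree = build_tree(X, y)
print(tree["feature"], tree["threshold"])
```

On the toy data, the first variable has the larger absolute correlation with the target, and the threshold falls midway between the closest points of the two classes on that variable.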