Using ANOVA to Analyze Modified Gini Index Decision Tree Classification
Source: Lamar University
Decision tree classification is a commonly used method in data mining and has been applied to predicting medical diagnoses. Among data mining methods for classification, decision trees have several advantages: they are simple to understand and interpret, and they can handle both numerical and categorical attributes. However, it is well known that when the Gini index is used as the splitting criterion, the method is biased toward multivalued attributes. In addition to having difficulty when the number of classes is large, the method also tends to favor tests that result in equal-sized partitions and purity in all partitions.
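The bias toward multivalued attributes can be illustrated with a small sketch. It assumes the standard Gini impurity, 1 - sum(p_k^2), and the usual size-weighted impurity of a split; it does not implement the modified index studied here. The toy labels, attribute values, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Standard Gini impurity: 1 - sum(p_k^2) over class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(values, labels):
    """Size-weighted Gini impurity after partitioning the labels by attribute value."""
    partitions = defaultdict(list)
    for value, label in zip(values, labels):
        partitions[value].append(label)
    n = len(labels)
    return sum(len(part) / n * gini(part) for part in partitions.values())


# Hypothetical toy data: 8 records, 2 classes.
labels = ["sick", "sick", "healthy", "healthy", "sick", "healthy", "sick", "healthy"]

# A two-valued attribute that is moderately informative.
binary_attr = ["a", "a", "a", "b", "b", "b", "a", "b"]

# A multivalued attribute with one distinct value per record (like a record ID).
id_attr = list(range(len(labels)))

gini_parent = gini(labels)
gini_binary = gini_split(binary_attr, labels)
gini_id = gini_split(id_attr, labels)

print(f"parent impurity:          {gini_parent:.3f}")
print(f"split on binary attribute: {gini_binary:.3f}")
print(f"split on ID-like attribute: {gini_id:.3f}")
```

In this sketch the ID-like attribute produces single-record partitions that are trivially pure, so its weighted impurity falls to zero and it would be selected over the binary attribute even though it carries no predictive value for new records.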