Improving Accuracy and Cost of Two-Class and Multi-Class Probabilistic Classifiers Using ROC Curves

Date Added: Jan 2010
Format: PDF

The probability estimates of a naive Bayes classifier are inaccurate if some of its underlying independence assumptions are violated. The decision criterion that turns these estimates into class predictions therefore has to be learned from the data. This paper proposes the use of ROC curves for this purpose. For two classes, the algorithm is a simple adaptation of the algorithm for tracing an ROC curve, which sorts the instances according to their predicted probability of being positive. As there is no obvious way to extend this algorithm to the multi-class case, the paper proposes a hill-climbing approach that adjusts the weight of each class in a pre-defined order.
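
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the grid of candidate weights, the choice to keep the first class weight fixed at 1, and the use of plain accuracy as the objective are all assumptions made for illustration. The two-class function sorts instances by predicted probability of being positive and keeps the cut-off with the most correct predictions; the multi-class function visits the classes in index order and keeps a new weight only when it improves accuracy.

```python
from typing import List, Sequence, Tuple


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Two-class case: return (threshold, accuracy); predict positive when score >= threshold."""
    n = len(scores)
    # Sort instance indices by predicted probability of being positive, highest first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Start with the threshold above every score: everything is predicted negative.
    correct = sum(1 for y in labels if y == 0)
    best_correct, best_thr = correct, float("inf")
    for rank, i in enumerate(order):
        # Lowering the threshold past instance i flips its prediction to positive.
        correct += 1 if labels[i] == 1 else -1
        # Only evaluate a cut between two distinct scores (tied scores move together).
        if rank == n - 1 or scores[order[rank + 1]] != scores[i]:
            if correct > best_correct:
                best_correct, best_thr = correct, scores[i]
    return best_thr, best_correct / n


def predict(row: Sequence[float], weights: Sequence[float]) -> int:
    """Pick the class with the largest weighted probability estimate."""
    scored = [w * p for w, p in zip(weights, row)]
    return max(range(len(scored)), key=scored.__getitem__)


def accuracy(probs: Sequence[Sequence[float]], labels: Sequence[int],
             weights: Sequence[float]) -> float:
    hits = sum(1 for row, y in zip(probs, labels) if predict(row, weights) == y)
    return hits / len(labels)


def hill_climb_weights(probs: Sequence[Sequence[float]], labels: Sequence[int],
                       candidates: Sequence[float] = (0.25, 0.5, 1.0, 2.0, 4.0)
                       ) -> Tuple[List[float], float]:
    """Multi-class case: adjust one class weight at a time, in class-index order.

    The candidate grid is a hypothetical simplification; the first weight stays at 1.
    """
    k = len(probs[0])
    weights = [1.0] * k
    best = accuracy(probs, labels, weights)
    for c in range(1, k):
        for w in candidates:
            trial = weights[:c] + [w] + weights[c + 1:]
            acc = accuracy(probs, labels, trial)
            if acc > best:  # keep a change only if it strictly improves accuracy
                best, weights = acc, trial
    return weights, best


if __name__ == "__main__":
    # Toy, hand-made data (not from the paper): scores are inflated for some negatives.
    print(best_threshold([0.99, 0.95, 0.9, 0.85, 0.6, 0.4], [1, 1, 0, 0, 0, 0]))
    toy_probs = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.5, 0.1, 0.4],
                 [0.2, 0.7, 0.1], [0.45, 0.15, 0.4]]
    print(hill_climb_weights(toy_probs, [0, 1, 2, 1, 2]))
```

Because each weight is fixed before the next class is visited, the result of such a greedy search depends on the class order and is not guaranteed to be a global optimum. In practice the threshold and weights would be chosen on training or validation data rather than on the test set.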