An Empirical Analysis of Imbalanced Data Classification
Support vector machines (SVMs) have received considerable attention as a way to address the challenging problem of learning from imbalanced data. Here, the authors conduct an empirical classification analysis of new UCI datasets that differ in imbalance ratio, size and complexity. The experiments compare the classification results of SVM with those of two other popular classifiers, naive Bayes and the C4.5 decision tree, to explore the strengths and weaknesses of each. To make the comparison more comprehensive and give a clearer picture of each classifier's learning performance, they employ four performance metrics in total: sensitivity, specificity, G-mean and time-based efficiency.
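As an illustration of the first three metrics, the following minimal sketch uses their standard textbook definitions for a binary problem in which the minority class is treated as positive: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and G-mean = sqrt(sensitivity × specificity). The function name `imbalance_metrics`, its `positive` parameter and the toy labels are hypothetical and chosen only for this example; the abstract does not give the authors' exact formulations, and time-based efficiency is left out because it is not defined here.

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, g_mean) for binary labels.

    Hypothetical helper for illustration: `positive` marks the minority
    class, and every other label is treated as the negative class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # minority example correctly detected
            else:
                fn += 1  # minority example missed
        else:
            if pred == positive:
                fp += 1  # majority example wrongly flagged
            else:
                tn += 1  # majority example correctly rejected

    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    return sensitivity, specificity, g_mean


# Toy data with a 2:8 imbalance ratio (made up for this example).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

sens, spec, g = imbalance_metrics(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} g_mean={g:.3f}")
```

On this toy data, 8 of the 10 predictions are correct, yet only half of the minority class is detected. This is why sensitivity, specificity and their geometric mean tend to be more informative than plain accuracy when the classes are imbalanced.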