Estimating False Negatives for Classification Problems With Cluster Structure

Source: University of Minnesota

Estimating the number of false negatives for a classifier when the true outcome is ascertained for only a limited number of instances is an important problem, with applications ranging from epidemiology to computer and network security. A frequently applied method is random sampling. However, when the target (positive) class is rare, as is often the case with network intrusions and diseases, this simple method requires excessive sampling. In this paper, the authors propose an approach that exploits the cluster structure of the data to significantly reduce the amount of sampling needed while guaranteeing an estimation accuracy specified by the user.
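
To illustrate why plain random sampling becomes costly when the positive class is rare, here is a minimal sketch. It assumes the textbook normal-approximation sample size for estimating a proportion to within a given relative error; it is not the method proposed in the paper, and the function name `required_sample_size` and its parameters are hypothetical.

```python
import math


def required_sample_size(p, rel_error, z=1.96):
    """Approximate number of random samples needed to estimate a proportion p
    to within +/- rel_error * p, using the normal approximation.

    Assumption (not from the paper): n = z^2 * (1 - p) / (rel_error^2 * p),
    where z is the normal quantile for the desired confidence (1.96 ~ 95%).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if rel_error <= 0.0:
        raise ValueError("rel_error must be positive")
    return math.ceil(z * z * (1.0 - p) / (rel_error * rel_error * p))


if __name__ == "__main__":
    # The rarer the positive class, the more samples random sampling needs
    # to reach the same relative accuracy.
    for p in (0.1, 0.01, 0.001):
        print(p, required_sample_size(p, rel_error=0.1))
```

Under this approximation the required sample size grows roughly in proportion to 1/p, which is the kind of cost a method exploiting cluster structure would aim to reduce.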
Format: PDF Size: 149.60
Date: Jan 2007