International Journal of Emerging Technology in Computer Science and Electronics (IJETCSE)
Most existing anomaly detection methods in data mining are implemented in batch mode and therefore do not scale to large databases. Many critical applications, such as credit card fraud detection, need an efficient method for identifying deviating data instances. In PCA-based methods, outliers are studied through the derived principal directions. However, in these methods, adding or removing a single data instance causes the principal directions to deviate. In this paper, the authors propose an online oversampling PCA in which adding or removing a single outlier instance does not affect the resulting principal direction of the data.
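To make the role of the principal direction concrete, the following is a minimal sketch, not the authors' method. It assumes that "oversampling" means duplicating one target instance several times and measuring how far the dominant principal direction rotates as a result. The function names, the `ratio` parameter, and the synthetic data are hypothetical and chosen only for illustration. The abstract does not specify the online update, so this sketch simply recomputes a full eigendecomposition for each instance, which is exactly the batch cost an online variant would aim to avoid.

```python
import numpy as np

def principal_direction(X):
    """Unit-norm dominant eigenvector of the covariance of X (rows are instances)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    # eigh returns eigenvalues in ascending order, so the last column is dominant.
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, -1]

def oversampling_scores(X, ratio=0.1):
    """Score each instance by how much duplicating it rotates the principal direction.

    ratio is a hypothetical parameter: the number of extra copies is ratio * n.
    The score is 1 - |cos(angle)| between the original and the perturbed direction.
    """
    n = X.shape[0]
    u = principal_direction(X)
    n_copies = max(1, int(round(ratio * n)))
    scores = np.empty(n)
    for i in range(n):
        # Append duplicates of instance i and recompute the direction (batch style).
        X_os = np.vstack([X, np.tile(X[i], (n_copies, 1))])
        u_i = principal_direction(X_os)
        scores[i] = 1.0 - abs(u @ u_i)
    return scores

# Synthetic data (an assumption): 200 points stretched along the x-axis,
# plus one instance placed far from them along the y-axis.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 2)) * np.array([1.0, 0.3])
outlier = np.array([[0.0, 8.0]])
X = np.vstack([normal, outlier])

scores = oversampling_scores(X)
most_suspicious = int(np.argmax(scores))
print("index with the largest direction change:", most_suspicious)
```

In this toy setting the instance with the largest score is the one whose duplication changes the principal direction the most, which is the kind of deviation the PCA-based analysis above relies on.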