Outlier Detection Over Data Set Using Cluster-Based and Distance-Based Approach
Outlier detection is currently a very active area of research in the data mining community, and finding outliers in a collection of patterns is a well-known problem in the field. An outlier is a pattern that is dissimilar to the rest of the patterns in the data set. The proposed method for outlier detection uses a hybrid approach: it first applies a clustering algorithm, k-means, to partition the data set into a number of clusters, and then finds outliers within each resulting cluster using a distance-based method. Whether a pattern is identified as an outlier depends on a threshold.
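The two-stage idea can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the reference point for the distance test, or how the threshold is chosen, so the following are assumptions made only for illustration: Euclidean distance, distance measured from each pattern to the centroid of its own cluster, and a single user-supplied threshold. The function names kmeans and detect_outliers and their parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centroids, labels


def detect_outliers(points, k, threshold, seed=0):
    """Stage 1: cluster with k-means. Stage 2: flag distant points.

    Assumption: a point is an outlier when its Euclidean distance to the
    centroid of its own cluster exceeds `threshold`.
    Returns the indices of the flagged points.
    """
    centroids, labels = kmeans(points, k, seed=seed)
    return [
        i
        for i, (p, lab) in enumerate(zip(points, labels))
        if math.dist(p, centroids[lab]) > threshold
    ]
```

Calling detect_outliers(points, k=2, threshold=3.0) on a list of 2-D tuples returns the indices of the patterns that lie more than 3.0 units from their cluster centroid. As the abstract notes, the result depends on the threshold: a smaller value flags more patterns, and a larger value flags fewer.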