Big Data

Combining Akaike's Information Criterion (AIC) and the Golden-Section Search Technique to Find Optimal Numbers of K-Nearest Neighbors

Date Added: May 2010
Format: PDF

K-Nearest Neighbor (KNN) is a widely accepted classification tool, and classification is one of the foremost machine-learning techniques used in the field of medical data mining. However, one of the most complicated tasks in developing a KNN classifier is determining the optimal number of nearest neighbors, K. This value is usually obtained by repeating experiments for different values of K until the minimum error rate is achieved. This paper describes a novel approach to finding the optimal number of nearest neighbors for a KNN classifier by combining Akaike's Information Criterion (AIC) with the golden-section search technique. The resulting optimal model was used to categorize a variety of medical data sets obtained from the UC Irvine Machine Learning Repository.
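
The abstract does not give the paper's exact AIC formulation, so the following is only a rough sketch of the search half of the idea: a golden-section search over integer values of K that minimizes some criterion function. The names `golden_section_min_int` and `toy_criterion` are hypothetical, the sketch assumes the criterion is unimodal in K, and the toy criterion is a stand-in for an AIC value computed from a fitted KNN model, not the authors' method.

```python
import math

# Inverse golden ratio, about 0.618.
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_min_int(criterion, lo, hi):
    """Return the integer k in [lo, hi] that minimizes criterion(k).

    Assumes criterion is unimodal on [lo, hi]. In the setting of the
    abstract, criterion(k) would be an AIC value for a KNN model with
    k neighbors; that formula is NOT reproduced here (assumption).
    """
    if lo > hi:
        raise ValueError("lo must not exceed hi")

    cache = {}  # avoid re-evaluating the (possibly expensive) criterion

    def f(k):
        if k not in cache:
            cache[k] = criterion(k)
        return cache[k]

    a, b = lo, hi
    while b - a > 2:
        # Two interior probe points placed by the golden ratio.
        c = int(round(b - INV_PHI * (b - a)))
        d = int(round(a + INV_PHI * (b - a)))
        if c == d:  # rounding can collapse the probes on short intervals
            d = c + 1
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]

    # At most three candidates remain; check them directly.
    return min(range(a, b + 1), key=f)


def toy_criterion(k):
    # Hypothetical stand-in for AIC(k): a unimodal curve with its minimum at k = 7.
    return (k - 7) ** 2


best_k = golden_section_min_int(toy_criterion, 1, 50)
print(best_k)  # 7
```

Compared with trying every K from 1 to 50, a search of this kind evaluates the criterion at only a handful of values, which is the practical appeal of replacing repeated experiments with a bracketing search.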