optimization machine-learning knn

What parameters to optimize in KNN?


I want to optimize KNN. There is a lot of material on tuning SVM, RF and XGBoost, but very little on KNN.

As far as I know, the number of neighbors is one parameter to tune.

But what other parameters should I test? Is there a good article on this?

Thank you


Solution

  • KNN is such a simple method that there is pretty much nothing to tune besides K. The whole method is literally:

    for a given test sample x:
       - find the K most similar samples in the training set, according to a similarity measure s
       - return the majority class among those K samples
    

    Consequently, the only thing that defines KNN besides K is the similarity measure s. There is literally nothing else in the algorithm (it is three lines of pseudocode). On the other hand, finding "the best similarity measure" is as hard a problem as learning a classifier itself, so there is no general method for doing it; people usually either use something simple (Euclidean distance) or use domain knowledge to adapt s to the problem at hand.
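
To make the two knobs concrete, here is a minimal pure-Python sketch of the pseudocode above, with K and the measure s as the only parameters. It is an illustration, not a reference implementation. The function names (`knn_predict`, `loo_accuracy`), the toy data, and the choice of leave-one-out accuracy for comparing settings are all assumptions made for this example. A distance is used in place of a similarity, so smaller means more similar.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance, shown only as an alternative choice of s."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, x, k, distance=euclidean):
    """Return the majority class among the k training samples closest to x."""
    # Rank training indices by distance to x (smaller = more similar).
    order = sorted(range(len(train_X)), key=lambda i: distance(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k, distance):
    """Leave-one-out accuracy: predict each sample from all the others."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k, distance) == y[i]
    return hits / len(X)


# Hypothetical toy data: two well-separated clusters of three points each.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
y = ["a", "a", "a", "b", "b", "b"]

# The whole "tuning" problem is a small grid over K and the distance function.
metrics = {"euclidean": euclidean, "manhattan": manhattan}
scores = {
    (k, name): loo_accuracy(X, y, k, fn)
    for k in (1, 3, 5)
    for name, fn in metrics.items()
}
best_k, best_metric = max(scores, key=scores.get)
print(scores)
print("best:", best_k, best_metric)
```

On real data you would replace the toy points with your own features and, if domain knowledge suggests it, pass a custom function as `distance`. The grid stays this small because K and s are all there is to choose.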