K Nearest Neighbours — Step By Step Explanation In 5 Minutes
Let’s say we want to perform a classification task using the K Nearest Neighbours (KNN) algorithm: predicting whether a student is part of the math Olympiad (1 if they are, 0 if not) based on 1) math score and 2) science score. To perform classification, we need:
- A test dataset: the students whose labels we want to predict
- A train dataset: labelled students that we compare each test student against
Our toy train dataset:
math science olympiad
90 86 1
81 89 1
70 81 1
79 94 0
73 77 0
68 76 0
Our toy test dataset:
math science olympiad
83 78 1
71 76 0
Our aim: we want to use the KNN algorithm to predict whether the 2 students in our test dataset are part of the math Olympiad, and then check those predictions against their actual labels.
The Essence Of The K Nearest Neighbours Algorithm
If others who are the most similar to me are from class 0, I am most probably from class 0 too.
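To make that idea concrete, here is a minimal Python sketch run on the toy datasets above. It rests on a few assumptions that are not stated in the text so far: "most similar" is measured with Euclidean distance, the number of neighbours consulted is k = 3 (chosen because the toy train dataset has only 6 rows), and the helper name knn_predict is purely illustrative.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+

# Toy train dataset from above: ((math, science), olympiad)
train = [
    ((90, 86), 1),
    ((81, 89), 1),
    ((70, 81), 1),
    ((79, 94), 0),
    ((73, 77), 0),
    ((68, 76), 0),
]

# Toy test dataset from above, with the true labels kept for checking
test = [
    ((83, 78), 1),
    ((71, 76), 0),
]


def knn_predict(point, train_rows, k=3):
    """Hypothetical helper: majority label among the k closest train rows."""
    # Sort the train rows by distance to the query point and keep the k closest
    # (assumption: Euclidean distance is the similarity measure)
    nearest = sorted(train_rows, key=lambda row: dist(point, row[0]))[:k]
    # Count the labels of those neighbours and return the most common one
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


predictions = [knn_predict(features, train) for features, _ in test]
actual = [label for _, label in test]

print(predictions)  # [1, 0]
print(actual)       # [1, 0]
```

Under these assumptions, both test students get the same label as the majority of their 3 closest train students, which matches their actual labels. An odd k is used so that a two-class vote cannot tie.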
In the KNN algorithm, we first set k to some number (usually 5, 10 or 15). For each prediction…