Recovering the original image from a noisy image using KNN (MNIST handwritten digit classification dataset)

Published in

Nerd For Tech

3 min read · Apr 1, 2020

Introduction: We will demonstrate how to use KNN for multi-class and multi-output classification. We will also discuss how to add noise to an image and then recover the original image using KNN.

Information about the dataset: MNIST's training set contains 60,000 small, square 28×28-pixel grayscale images of handwritten single digits between 0 and 9. The task is to classify a given image of a handwritten digit into one of 10 classes representing the integer values 0 to 9, inclusive.
Link: https://www.kaggle.com/c/digit-recognizer/data

Prerequisites: basic machine learning and familiarity with the K-Nearest Neighbors (KNN) algorithm

1. We start by importing the dataset.

2. The data already comes divided into a training dataset and a test dataset, so we define X_train and y_train directly from the training data.

X_train contains the original pixel values of the handwritten digits, and y_train contains their labels.
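A minimal sketch of steps 1 and 2, assuming the Kaggle `train.csv` layout (a `label` column followed by the pixel columns). The `load_train` helper is a hypothetical name, not something from the original notebook, and a tiny in-memory CSV stands in for the real file so the snippet runs on its own.

```python
import io
import numpy as np

def load_train(source):
    """Load a Kaggle-style training CSV: first column is the label, the rest are pixels."""
    # skiprows=1 drops the header line; ndmin=2 keeps a 2-D array even for a single row.
    data = np.loadtxt(source, delimiter=",", skiprows=1, dtype=np.int64, ndmin=2)
    y = data[:, 0]   # digit labels
    X = data[:, 1:]  # pixel intensities, one row per image
    return X, y

# Stand-in for the real file: two "images" with 4 pixels each (real rows have 784).
demo_csv = io.StringIO(
    "label,pixel0,pixel1,pixel2,pixel3\n"
    "4,0,128,255,0\n"
    "7,0,0,64,200\n"
)
X_train, y_train = load_train(demo_csv)
print(X_train.shape, y_train)
```

With the downloaded data, the same helper could be called as `load_train("train.csv")`, assuming the file sits in the working directory. `pandas.read_csv` is a common alternative for files of this size.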

3. We plot one of the original images using matplotlib. The digit is clearly a ‘4’, and checking its label in y_train confirms that it is indeed labeled ‘4’.

Plotting the digit and then checking its label.
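The plotting step might look roughly like the sketch below. The helper names `to_image` and `show_digit` are illustrative assumptions, and a synthetic vertical stroke stands in for a real row of X_train. Each row holds 784 intensities, so it has to be reshaped to 28×28 before it can be drawn.

```python
import numpy as np

def to_image(pixels, side=28):
    """Reshape a flat row of side*side pixel intensities into a 2-D image."""
    return np.asarray(pixels).reshape(side, side)

def show_digit(pixels, label=None):
    """Draw one digit in grayscale; the title shows the label when one is given."""
    import matplotlib.pyplot as plt  # imported here so to_image works without matplotlib
    plt.imshow(to_image(pixels), cmap="binary")
    plt.axis("off")
    if label is not None:
        plt.title(f"label: {label}")
    plt.show()

# Synthetic stand-in for one row of X_train: a bright vertical stroke.
stroke = np.zeros((28, 28), dtype=np.int64)
stroke[4:24, 13:15] = 255
row = stroke.ravel()   # flat vector of 784 values, like one row of the CSV

image = to_image(row)
print(image.shape)
```

On the real data, a call such as `show_digit(X_train[i], y_train[i])` for some index `i` would draw the digit with its label as the title.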

5. Now we introduce noise into the pixel intensities of the original digits using the randint() function of the NumPy library. numpy.random.randint() is one of NumPy's random sampling functions. It returns an array of the specified shape filled with random integers from low (inclusive) to high (exclusive), i.e. in the half-open interval [low, high).
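A small illustration of the half-open interval. The bounds 0 and 100 are an assumption chosen for the example, since the text does not state which range was used.

```python
import numpy as np

np.random.seed(42)  # fixed seed so the example is repeatable

# One noise value per pixel for 3 images of 784 pixels each.
# low=0 is inclusive and high=100 is exclusive, so every value lies in 0..99.
noise = np.random.randint(0, 100, size=(3, 784))

print(noise.shape)
print(noise.min(), noise.max())
```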

6. Next, we load the test dataset.

We have loaded the test dataset and defined X_test.
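A short sketch of this step, assuming the Kaggle `test.csv` layout, which has pixel columns only and no `label` column. An in-memory CSV stands in for the real file.

```python
import io
import numpy as np

# Stand-in for the real file: a header plus two rows of 4 pixels (real rows have 784).
demo_csv = io.StringIO(
    "pixel0,pixel1,pixel2,pixel3\n"
    "0,10,250,0\n"
    "5,0,0,90\n"
)

# Every column is a pixel, so the whole array becomes X_test.
X_test = np.loadtxt(demo_csv, delimiter=",", skiprows=1, dtype=np.int64, ndmin=2)
print(X_test.shape)
```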

7. We add noise to our dataset.
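One way to apply the noise to whole datasets is sketched below. The `add_noise` helper, the 0 to 100 range, the name `X_test_mod`, and the constant stand-in arrays are all assumptions made for illustration.

```python
import numpy as np

def add_noise(X, low=0, high=100, seed=None):
    """Return a copy of X with an independent random integer in [low, high) added to every pixel."""
    rng = np.random.RandomState(seed)
    noise = rng.randint(low, high, size=X.shape)
    return X + noise

# Stand-ins for the real data: all-black and all-white "images".
X_train = np.zeros((5, 784), dtype=np.int64)
X_test = np.full((2, 784), 255, dtype=np.int64)

X_train_mod = add_noise(X_train, seed=0)  # noisy training inputs
X_test_mod = add_noise(X_test, seed=1)    # noisy test inputs

print(X_train_mod.shape, X_test_mod.shape)
```

Note that the noisy values can exceed 255. That is fine for KNN, which only compares distances between pixel vectors.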

8. Let us take a peek at our dataset.

9. Our next step is training a KNN model for multi-output classification. Here, X_train_mod is the original X_train with noise added to its pixel intensities, and the noisy test inputs are built from X_test in the same way. The targets are the clean images: the original X_train and X_test play the roles of y_train and y_test respectively.
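To see what multi-output means in practice, here is a toy sketch with made-up data in which every sample carries two labels instead of one. In the denoising setup the same idea applies at a larger scale: each of the 784 pixels is one output, and its intensity is the class.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Four samples with 2 features each; every sample has TWO labels (the columns of y).
X = np.array([[0, 0], [0, 1], [10, 10], [10, 11]])
y = np.array([[0, 5], [0, 5], [1, 9], [1, 9]])

knn = KNeighborsClassifier(n_neighbors=2)
knn.fit(X, y)  # a 2-D y makes this a multi-output problem

# Each prediction is a full row of labels, one per output column.
pred = knn.predict(np.array([[1, 0], [9, 10]]))
print(pred)
```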

10. The final step is cleaning a noisy image to recover the original image.
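The training and cleaning steps can be sketched end to end as follows. To keep the example self-contained and fast, two synthetic 28×28 patterns (a vertical and a horizontal bar) stand in for the MNIST digits. The variable names, the noise range, and the default `n_neighbors=5` are assumptions rather than the original notebook's settings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.RandomState(0)

def make_bar(vertical):
    """Build a flat 784-pixel image containing one bright bar."""
    img = np.zeros((28, 28), dtype=np.int64)
    if vertical:
        img[:, 12:16] = 255
    else:
        img[12:16, :] = 255
    return img.ravel()

clean_a, clean_b = make_bar(True), make_bar(False)

# Targets are the clean images: 20 copies of each pattern.
y_train_mod = np.vstack([clean_a] * 20 + [clean_b] * 20)
# Inputs are the same images with random noise added to every pixel.
X_train_mod = y_train_mod + rng.randint(0, 100, size=y_train_mod.shape)

knn_clf = KNeighborsClassifier()       # default n_neighbors=5
knn_clf.fit(X_train_mod, y_train_mod)  # 784 outputs, one per pixel

# Clean a previously unseen noisy image.
noisy_digit = clean_a + rng.randint(0, 100, size=784)
clean_digit = knn_clf.predict([noisy_digit])[0]

print(np.array_equal(clean_digit, clean_a))
```

The prediction for each pixel is the most common value among the nearest neighbors' clean targets, which is why the noise disappears. On real digits the result is a plausible clean digit rather than an exact copy of the original.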


Happy day folks!

Written by Akshat Dubey