Learn to Learn Like a Child: Few-Shot Learning for Image Classification

TL;DR
- The typical approach to k-class image classification needs many sample images for each class, hundreds or even thousands. Collecting and labeling that many images is a huge challenge.
- Few-shot learning offers an alternative. Given a query image and a k-way n-shot set of reference images, where each of the k classes holds only a few images (n shots), we find the class the query image belongs to by computing the similarity between the feature vector of the query image and the mean feature vector of the images in each class.
The four categories of animals in the cards, with one image per category. Image Source: Google Image Search.

Traditional Approach for Image Classification

The traditional deep learning approach to image classification requires a lot of training images to reach high accuracy. For example, a simple four-class image classification task requires hundreds of images per class, 250 images on average in the cited example (Andi Sama, Arfika Nurhudatiana, 2019b).

Learn to Learn with Few-Shot Learning

Instead of training on many images for every class, we compute similarity scores between the input image (the “Query”) and a list of images (the “Support Set”) that holds very few samples per class, as few as a single sample. A class is an image category, e.g., anime man, anime woman, real man, and real woman in the reference example (Andi Sama, Arfika Nurhudatiana, 2019b).

Two images of Anime_Woman. Image Source: Google Image Search.
image_sim = sim(image_1, image_2)
print(image_sim)
1.00
The left image is in the Anime_Woman class, and the right image is in the Real_Woman class. Image Source: Google Image Search.
image_sim = sim(image_1, image_2)
print(image_sim)
0.00
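
The `sim` function in the snippets above is not defined in the article. A minimal sketch, assuming it is the cosine similarity between the feature vectors of the two images, might look like the following. The short hand-written vectors are hypothetical stand-ins for features that a pretrained network would produce.

```python
import math

def sim(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical feature vectors (assumption: not taken from real images).
image_1 = [1.0, 0.0, 2.0]
image_2 = [2.0, 0.0, 4.0]  # points in the same direction as image_1
image_3 = [0.0, 3.0, 0.0]  # orthogonal to image_1

print(f"{sim(image_1, image_2):.2f}")  # prints 1.00
print(f"{sim(image_1, image_3):.2f}")  # prints 0.00
```

With real images the scores are rarely exactly 1.00 or 0.00. Two different pictures from the same class would typically score high, and pictures from different classes lower.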

Query and Support Set

The Query is simply an input image that we want to compare against a list of images in different categories/classes. That list is the Support Set: k classes (k-way), each with n sample images (n-shot).

An illustration of one Query Image and a 4-way 1-shot Support Set. Some of the images are from Google Image Search.
An illustration of one Query Image and a 4-way 2-shot Support Set. Some of the images are from Google Image Search.
An example of a query against a 4-way Support Set with only one image per class. The highest similarity score between the query image feature vector and the feature vector of each class in the Support Set indicates the chosen class, in this case, “Anime_Man”.
An example of a query against a 4-way Support Set with two sample images per class. The highest similarity score between the query image feature vector and the mean vector of each class in the Support Set indicates the chosen class, in this case, “Real_Woman”.
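
One way to picture a k-way n-shot Support Set in code is a mapping from class name to a list of feature vectors. The sketch below is only an illustration under that assumption: the helper name `describe_support_set` and the two-dimensional vectors are hypothetical, while the class names follow the example above.

```python
def describe_support_set(support_set):
    """Return (k, n) for a k-way n-shot support set.

    support_set maps each class name to a list of feature vectors.
    Raises ValueError if it is empty or the classes differ in size.
    """
    if not support_set:
        raise ValueError("support set must contain at least one class")
    shots = {len(vectors) for vectors in support_set.values()}
    if len(shots) != 1 or 0 in shots:
        raise ValueError("every class needs the same non-zero number of samples")
    return len(support_set), shots.pop()

# Toy 4-way 2-shot support set with made-up 2-D feature vectors.
support_set = {
    "Anime_Man":   [[0.9, 0.1], [0.8, 0.2]],
    "Anime_Woman": [[0.1, 0.9], [0.2, 0.8]],
    "Real_Man":    [[0.7, 0.7], [0.6, 0.8]],
    "Real_Woman":  [[0.3, 0.4], [0.4, 0.3]],
}

k, n = describe_support_set(support_set)
print(f"{k}-way {n}-shot")  # prints 4-way 2-shot
```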

Training a Few-Shot Learning Model

There are two basic steps in building a few-shot model: (1) pretraining and (2) making a few-shot prediction. The prediction step works as follows:

  • Map the images in the Query and Support Set to feature vectors.
  • Average the feature vectors in the same class to obtain the mean vector for each class. If the support set has k classes, then we will have k mean vectors.
  • Compare the query feature vector with the mean vector of each of the k classes using cosine similarity (the similarity scores). The class with the highest score is the prediction.
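
The steps above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the article's implementation: the first step (mapping images to feature vectors with a pretrained network) is assumed to have already happened, the three-dimensional vectors are made up, and the function names `mean_vector`, `cosine_similarity`, and `predict` are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_similarity(a, b):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def predict(query_vector, support_set):
    """Return (best_class, scores) for a query against a support set.

    support_set maps each class name to a list of feature vectors.
    """
    scores = {
        class_name: cosine_similarity(query_vector, mean_vector(vectors))
        for class_name, vectors in support_set.items()
    }
    best_class = max(scores, key=scores.get)
    return best_class, scores

# Made-up feature vectors for a 4-way 2-shot support set.
support_set = {
    "Anime_Man":   [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "Anime_Woman": [[0.0, 1.0, 0.0], [0.2, 0.8, 0.0]],
    "Real_Man":    [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]],
    "Real_Woman":  [[0.5, 0.5, 0.5], [0.4, 0.6, 0.5]],
}
query_vector = [0.1, 0.9, 0.1]

best_class, scores = predict(query_vector, support_set)
for class_name, score in scores.items():
    print(f"{class_name}: {score:.2f}")
print("Predicted class:", best_class)  # prints Predicted class: Anime_Woman
```

With one image per class (1-shot), the mean vector is just that image's feature vector, so the same function covers both cases.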

Prediction Accuracy

When making predictions with few-shot learning, accuracy tends to drop as the number of classes (k in k-way) grows, and tends to rise as the number of sample images per class (n in n-shot) grows.
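
One simple way to see why more classes make the task harder is to look at the random-guess baseline: with k equally likely classes, guessing is correct 1/k of the time. The sketch below only computes that baseline. It is not a measurement of any few-shot model, and the function name `chance_accuracy` is hypothetical.

```python
def chance_accuracy(k):
    """Accuracy of uniform random guessing among k equally likely classes."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 / k

for k in (2, 4, 10):
    print(f"{k}-way: {chance_accuracy(k):.0%}")
# prints:
# 2-way: 50%
# 4-way: 25%
# 10-way: 10%
```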

References
