Semi-Supervised Learning using Variational Auto-Encoders



As discussed in the meeting, improving the accuracy of the classification task calls for a semi-supervised learning approach. The following paper provided a cue for this kind of approach:
Resisting Adversarial Attacks Using Gaussian Mixture Variational Autoencoders

The paper proposes a solution for semi-supervised learning based on variational autoencoders (VAEs). The current VAE model is defined as:

[Image: definition of the VAE model]
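The original figure is not reproduced here. As a point of reference only, the standard VAE formulation from the literature (which may differ in notation or detail from the figure) uses a decoder p_θ(x|z), a standard normal prior on z, and an encoder q_φ(z|x), and maximizes the following lower bound on the log-evidence:

```latex
p_\theta(x, z) = p_\theta(x \mid z)\, p(z), \qquad p(z) = \mathcal{N}(0, I)

\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
  \le \log p_\theta(x)
```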

Semi-Supervised Learning using VAE

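The approach leverages the variational lower bound (ELBO). Because the log-evidence does not depend on the approximate posterior, maximizing the ELBO with respect to it is equivalent to minimizing the KL divergence between the approximate posterior and the true posterior, which is what we need.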
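To make the ELBO/KL relationship concrete, here is a small numeric check on a toy model with a binary latent variable. All probabilities are made-up illustrative values, not anything from the paper; the point is only that log p(x) minus the ELBO equals the KL divergence to the true posterior.

```python
import math

# Toy model: binary latent z and one fixed observation x.
# All numbers below are invented for illustration.
prior = [0.5, 0.5]        # p(z)
likelihood = [0.8, 0.3]   # p(x | z) for the observed x
q = [0.6, 0.4]            # approximate posterior q(z | x)

joint = [pz * px for pz, px in zip(prior, likelihood)]  # p(x, z)
evidence = sum(joint)                                   # p(x)
posterior = [j / evidence for j in joint]               # true p(z | x)


def elbo(q_z, joint_xz):
    """ELBO = E_q[log p(x, z) - log q(z)] for a discrete latent."""
    return sum(qz * (math.log(j) - math.log(qz)) for qz, j in zip(q_z, joint_xz))


def kl(q_z, p_z):
    """KL(q || p) between two discrete distributions."""
    return sum(qz * (math.log(qz) - math.log(pz)) for qz, pz in zip(q_z, p_z))


log_evidence = math.log(evidence)
gap = log_evidence - elbo(q, joint)  # should equal KL(q || true posterior)
```

Since log p(x) is a constant here, any change to q that raises the ELBO lowers the KL term by the same amount, and the gap closes completely when q equals the true posterior.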

There are two possible approaches:

  • First approach:
    • We can train a VAE on all the data points (labelled and unlabelled) and transform the observed data (X) into the latent space defined by the Z variables.
    • After training the VAE, we can use the encoder output directly as the features for learning.
    • Then we can solve a standard supervised learning problem on the labelled data using (Z, Y) pairs (where Y is the label).
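The first approach can be sketched roughly as follows. This is a minimal illustration and not the planned implementation: `encode` is a hypothetical stand-in for the mean output of an already trained VAE encoder (here just a fixed projection), the data points are invented, and a nearest-centroid rule stands in for whatever supervised classifier is eventually used on the (Z, Y) pairs.

```python
def encode(x):
    """Hypothetical stand-in for the trained VAE encoder mean (fixed 2-D projection)."""
    return (x[0] + x[1], x[2] - x[3])


def fit_centroids(latents, labels):
    """Average the latent codes of each class."""
    sums, counts = {}, {}
    for z, y in zip(latents, labels):
        acc = sums.setdefault(y, [0.0] * len(z))
        for i, v in enumerate(z):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, z):
    """Return the class whose centroid is closest to z (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, z))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


# Step 1 (not shown): train the VAE on labelled + unlabelled X.
# Step 2: encode only the labelled subset into Z.
labelled_x = [(1.0, 1.0, 0.0, 0.0), (0.9, 1.1, 0.1, 0.0),
              (0.0, 0.0, 1.0, 0.0), (0.1, 0.0, 0.9, 0.1)]
labelled_y = [0, 0, 1, 1]
latents = [encode(x) for x in labelled_x]

# Step 3: ordinary supervised learning on the (Z, Y) pairs.
centroids = fit_centroids(latents, labelled_y)
prediction = predict(centroids, encode((1.0, 0.8, 0.0, 0.1)))
```

The unlabelled data only influences the result through the encoder, which is why this approach cannot use the labels to shape the latent space.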
  • Second approach:

    We can also take the labels explicitly into account during training, which is a better approach. The pipeline is as follows:

[Image: pipeline of the label-aware model]

  • Here, y is a one-hot encoded categorical variable representing our class labels, whose relative probabilities are parameterized by π.
  • After the mathematical derivation, the loss functions can be defined as:

[Image: loss functions for the labelled and unlabelled cases]

  • Thus, we have addressed two cases: one where we observe the y labels and one where we don’t.
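The two cases above can be sketched in code. Since the loss figure is not reproduced here, this assumes the common label-conditioned semi-supervised VAE formulation (often called the M2 model), which matches the description of y and π but may differ from the exact losses in the figure. The function names and the scalar inputs (a reconstruction negative log-likelihood, a KL term, and log p(y)) are illustrative assumptions.

```python
import math


def labelled_loss(recon_nll, kl_z, log_prior_y):
    """Negative bound for one (x, y) pair when y is observed.

    recon_nll:   -E_q[log p(x | y, z)]
    kl_z:        KL(q(z | x, y) || p(z))
    log_prior_y: log p(y) under the categorical prior parameterized by pi
    """
    return recon_nll + kl_z - log_prior_y


def unlabelled_loss(per_class_losses, q_y):
    """Negative bound for one x when y is not observed.

    per_class_losses[k] is labelled_loss evaluated as if y were class k;
    q_y[k] is the classifier's probability q(y = k | x).
    The label is marginalized out and the entropy of q(y | x) is subtracted.
    """
    expected = sum(q * loss for q, loss in zip(q_y, per_class_losses))
    entropy = -sum(q * math.log(q) for q in q_y if q > 0)
    return expected - entropy


# Example: two classes, classifier undecided about an unlabelled point.
example = unlabelled_loss([2.0, 4.0], [0.5, 0.5])
```

When the classifier is certain about a class, the entropy term vanishes and the unlabelled loss reduces to the labelled loss for that class.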

Classifier:

For the classifier part, I will be using my fine-tuned VGG16 model. Code with the architecture:

[Image: code listing of the fine-tuned VGG16 architecture]

The encoder and decoder, meanwhile, will use 4 convolutional layers and 2 dense layers.
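To get a feel for how large the first dense layer of the encoder would be, here is a small helper that tracks the feature-map size through the convolutional layers. Everything specific here is an assumption for illustration: a 224x224 input (the usual VGG16 input size), 3x3 kernels with stride 2 and padding 1, and the channel counts 32, 64, 128, 256.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution, using the standard floor formula."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_flat_features(input_size, channels):
    """Features reaching the first dense layer after len(channels) conv layers."""
    size = input_size
    for _ in channels:
        size = conv_out(size)
    return channels[-1] * size * size


# Assumed configuration: 224x224 input, 4 conv layers with these channel counts.
flat = encoder_flat_features(224, [32, 64, 128, 256])
```

With these assumed settings the spatial size halves at every layer (224, 112, 56, 28, 14), so the choice of stride and channel count directly determines how many parameters the dense layers need.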

Some Concerns:

We have 2 classes (nazi-elements and non-nazi-elements); however, they contain various sub-classes (such as nazi-salute, propaganda, cross-grenade, etc.) that are very different from each other. So, in my opinion, it would be better to classify on the sub-classes along with the main classes rather than with a binary classifier only. This part needs to be discussed.
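One way this could work, as a sketch for the discussion rather than a decided design: the classifier predicts sub-class probabilities, and the main-class decision is obtained by summing the probabilities of the sub-classes that belong to it. The positive sub-class names come from the list above; "other" is a hypothetical placeholder for the non-nazi-elements side, and the probabilities are invented.

```python
# Mapping from sub-class to main class. "other" is a hypothetical placeholder.
SUB_TO_MAIN = {
    "nazi-salute": "nazi-elements",
    "propaganda": "nazi-elements",
    "cross-grenade": "nazi-elements",
    "other": "non-nazi-elements",
}


def main_class_probs(sub_probs):
    """Sum sub-class probabilities into their parent main class."""
    totals = {}
    for sub, p in sub_probs.items():
        main = SUB_TO_MAIN[sub]
        totals[main] = totals.get(main, 0.0) + p
    return totals


# Invented sub-class probabilities for a single image.
probs = main_class_probs(
    {"nazi-salute": 0.5, "propaganda": 0.125, "cross-grenade": 0.125, "other": 0.25}
)
```

This keeps the binary decision available while letting the model learn the visually distinct sub-classes separately.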

Also, the VAE components described in the papers mainly target the MNIST and SVHN datasets, which are very simple, whereas our dataset is more complex and contains fewer examples. This can cause issues with accuracy and reconstruction quality.
