In this blog post, we look at generative adversarial networks (GANs). A generative adversarial network is a class of machine learning framework used for training generative models. Generative models create new data instances that resemble the training data. Given a training set, a GAN learns to generate new data with the same statistics as the training set. A GAN's behavior depends heavily on its training loss: the model tries to minimize that loss so that the generated images look as real as possible.
Table of contents
1) What is a GAN and how does it work?
2) What are conditional GANs?
3) Advantages of cGANs
4) Explanation with picture
5) Use cases of cGANs
1) What is a GAN and how does it work?
A GAN is a generative model that achieves a high level of realism by pairing a generator with a discriminator. The generator learns to produce the target output, while the discriminator learns to distinguish true data from the output of the generator.
As an analogy, the generator tries to fool the discriminator, and the discriminator tries to avoid being fooled.
Initially, the discriminator is trained on a known dataset: it is presented with samples from the training dataset until it achieves acceptable accuracy. The generator is then trained based on whether it succeeds in fooling the discriminator. Typically, the generator is fed randomized input sampled from a predefined latent space (e.g. a multivariate normal distribution). The output produced by the generator is then evaluated by the discriminator.
- Generator network: takes a random vector as input and decodes it into a synthetic image.
- Discriminator network: takes an image as input and predicts whether the image came from the training set or was created by the generator network.
(Citation: Deep Learning with Python by François Chollet)
GANs rely on backpropagation through both networks to minimize their errors, so that the generator produces better images while the discriminator becomes more skilled at flagging synthetic images. Typically, the discriminator is a convolutional neural network, and the generator is a deconvolutional (transposed-convolution) neural network.
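The alternating generator and discriminator updates can be illustrated with a deliberately tiny example. The sketch below is a toy under stated assumptions, not the convolutional setup described in this post: the "images" are single numbers drawn from N(4, 1), the generator is the linear map G(z) = a*z + b, the discriminator is D(x) = sigmoid(w*x + c), gradients are written out by hand, and the generator uses the common non-saturating update (maximize log D(G(z))). The names `train_toy_gan`, `a`, `b`, `w`, and `c` are illustrative only.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters:      G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters:  D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        # Assumed toy "training set": real samples come from N(4, 1).
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        # Latent noise sampled from a standard normal distribution.
        z = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * zi + b for zi in z]

        # Discriminator step: gradient ascent on log D(x) + log(1 - D(G(z))).
        grad_w = grad_c = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            grad_w += (1.0 - d) * x   # d/dw log D(x)
            grad_c += (1.0 - d)       # d/dc log D(x)
        for x in fake:
            d = sigmoid(w * x + c)
            grad_w += -d * x          # d/dw log(1 - D(x))
            grad_c += -d              # d/dc log(1 - D(x))
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: gradient ascent on log D(G(z)) (non-saturating form).
        grad_a = grad_b = 0.0
        for zi in z:
            d = sigmoid(w * (a * zi + b) + c)
            grad_a += (1.0 - d) * w * zi   # chain rule through G(z) = a*z + b
            grad_b += (1.0 - d) * w
        a += lr * grad_a / batch
        b += lr * grad_b / batch

    return (a, b), (w, c)
```

Calling `train_toy_gan()` returns the final generator and discriminator parameters. In this toy, the generator offset `b` is pushed toward the region the discriminator currently rates as real; GAN training is known to oscillate, so the sketch makes no claim about how close the result ends up to the real distribution.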
Although originally proposed as a form of generative model for unsupervised learning, GANs have also proven useful for supervised learning and reinforcement learning.
2) What are conditional GANs?
A conditional generative adversarial network, or cGAN for short, is a type of GAN that involves the conditional generation of images by a generator model.
In the previous section, we explored GANs where we have no control over the type of output being produced. Unlike most generative network architectures, cGANs are not completely unsupervised in their training. A cGAN requires class labels or other labeled data to steer what it generates. Let us look at the difference between a simple GAN architecture and a cGAN architecture through their mathematical formulations, starting with the expression for the standard GAN objective.
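The standard GAN objective, as formulated in the original GAN paper by Goodfellow et al. (2014), is a minimax game between the generator G and the discriminator D. It is presumably the expression this section refers to:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here x is a real sample, z is a latent noise vector, D(x) is the discriminator's estimated probability that x is real, and G(z) is a generated sample. The conditional version, following Mirza and Osindero (2014), conditions both networks on a label y:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big]
```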
With a minor modification to the formula for the simple GAN architecture, a label y is added to both the discriminator and the generator network. Converting the previous probabilities into conditional probabilities by adding the label y ensures that the generator and discriminator are trained with respect to that label. Hence, once training is complete, we can send a particular input label and receive the desired output from the generative network.
Both the generator and the discriminator receive these labels during the training process. Hence, both networks of the cGAN architecture are trained conditionally: the generator learns to produce only outputs that match the given label, while the discriminator checks both whether the generated output is real or fake and whether the image matches the given label.
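To make the conditional objective concrete, here is a minimal sketch of how the two losses could be computed from the discriminator's outputs. It assumes the discriminator has already produced probabilities strictly between 0 and 1 for real (image, label) pairs and for generated (image, label) pairs; the function names are illustrative and not taken from any library.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Loss the discriminator minimizes.

    d_real: probabilities D(x | y) for real image/label pairs.
    d_fake: probabilities D(G(z | y) | y) for generated image/label pairs.
    This is the negative of the value function, so lower is better for D.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """Minimax loss the generator minimizes: mean of log(1 - D(G(z | y) | y))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
```

A discriminator that cannot tell real from fake outputs 0.5 everywhere, which gives a discriminator loss of 2 * log(2). In practice many implementations replace the generator's minimax loss with a non-saturating variant that maximizes log D(G(z | y) | y) instead.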
3) Advantages of cGANs
By providing additional information to the model, we get two benefits:
- Convergence tends to be faster, because even the random distribution that the fake images follow will have some pattern.
- You can control the output of the generator at test time by giving the label for the image you want to generate.
4) Explanation with picture
If that was confusing, consider this example:
Suppose you train a GAN on handwritten digits (the MNIST dataset). You normally cannot control which specific images the generator will produce. In other words, there is no way to request a particular digit image from the generator.
This is where cGANs come in: we can add an additional input layer that takes one-hot-encoded image labels. This additional layer guides the generator in terms of which image to produce.
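One common way to feed the label to the generator is to concatenate the one-hot label vector with the latent noise vector. The sketch below shows only that input construction, not a trained generator; the helper names `one_hot` and `generator_input`, the latent size of 100, and the choice of concatenation are assumptions made for illustration.

```python
import random


def one_hot(label, num_classes):
    """Return a list of zeros with a single 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def generator_input(label, latent_dim=100, num_classes=10, rng=random):
    """Build a conditioned generator input: latent noise followed by the one-hot label."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return z + one_hot(label, num_classes)
```

For MNIST, `generator_input(7)` yields a vector of length 110 whose last ten entries encode the digit 7. Requesting a different digit at test time only changes those last ten entries.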
The input to the additional layer can be a feature vector derived from either an image that encodes the class or a set of specific characteristics we expect from the image.
Conditional generative adversarial networks are not strictly unsupervised learning algorithms because they require labeled data as input to the additional layer.
5) Use cases of cGANs
· Image-to-image translation
· Text-to-image synthesis
· Video generation
· Convolutional face generation