Mapping my Face in the GAN Latent Space
Progressive Generative Adversarial Networks (GANs), presented at ICLR 2018 this year, achieved incredible results on synthetic face generation. Part of the improvement in this paper comes from a carefully curated data set of high-resolution celebrity photos (1024 by 1024 pixels). Given a latent vector in a 512-dimensional space, the network outputs a very realistic face that resembles a famous actor. These are some of the faces a Progressive GAN can generate:
It is also possible to move from one face to another by linearly interpolating between their two latent vectors. I found a useful Gist by Mat Kelcey that does this with TensorFlow Hub. TensorFlow Hub provides a collection of pre-trained networks that can be called straight from a Python script. These networks do not need to be explicitly downloaded either, which is quite convenient. It includes a trained Progressive GAN that I can even run on my laptop with no GPU. Unfortunately, this model has some limitations: its resolution is 128 by 128 pixels, and it does not seem to have been trained on the same high-quality data set as the original paper. Still, when interpolating between latent vectors, the results are very smooth and natural. For example, when transitioning between a male face with short hair and a female face with long hair, the hair grows naturally.
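The interpolation itself is only a few lines. Here is a minimal NumPy sketch of it, not the code from the Gist: the function name `interpolate_latents` is made up for illustration, and the two latent vectors are drawn from a standard normal distribution purely as placeholders. Each row of the result would then be fed to the generator to produce one frame of the transition.

```python
import numpy as np


def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent vectors evenly spaced on the line from z_start to z_end."""
    # Column of mixing weights going from 0 (all z_start) to 1 (all z_end).
    alphas = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - alphas) * z_start + alphas * z_end


rng = np.random.default_rng(0)
latent_dim = 512  # latent size mentioned in the post

# Placeholder latent vectors (assumption: sampled from a standard normal).
z_a = rng.standard_normal(latent_dim)
z_b = rng.standard_normal(latent_dim)

path = interpolate_latents(z_a, z_b, num_steps=8)
print(path.shape)  # one latent vector per row
```

The first row is `z_a`, the last row is `z_b`, and consecutive rows are separated by the same step in latent space.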
I have been asking myself whether it is possible to identify the latent vector that generated a face. It turns out that there is a very simple technique, presented at an ICLR 2017 workshop: Precise Recovery of Latent Vectors from Generative Adversarial Networks by Zachary C. Lipton and Subarna Tripathi.
Recovering a Latent Vector
This problem can be framed as a simple optimisation problem that can be solved by gradient descent. Let z be the original unknown latent vector that generated a face and ϕ be the generator. Then we want to find z’ that generates a face that is as close as possible to the original one. That is, we want to solve this optimisation problem:
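Assuming closeness between the two images is measured by the squared Euclidean (L2) distance, which is my reading of the workshop paper, the objective can be written as:

```latex
\[
  \min_{z'} \; \lVert \phi(z) - \phi(z') \rVert_2^2
\]
```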
z’ can be learned by gradient descent as follows:
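Assuming a squared Euclidean distance between the two generated images, and writing η for the learning rate (a symbol introduced here for illustration), one gradient descent step on z’ looks like this:

```latex
\[
  z' \leftarrow z' - \eta \, \nabla_{z'} \lVert \phi(z) - \phi(z') \rVert_2^2
\]
```

The step is repeated from a random initial z’ until the generated image ϕ(z’) stops getting closer to ϕ(z).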
The workshop paper proposing this technique says that the images generated by the recovered latent vector z’ are indistinguishable from the original. I had a go at implementing this in TensorFlow and the results are very good: the recovered image is almost identical to the original image. However, recovering the original latent vector itself turned out to be harder. In most cases, I obtain a recovered latent vector z’ that is different from z. Nonetheless, the generated images ϕ(z’) and ϕ(z) are almost identical.
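To make the procedure concrete, here is a toy NumPy sketch of latent recovery by gradient descent. It is not my TensorFlow implementation and it does not use the Progressive GAN: the generator is a hypothetical stand-in (a fixed random linear map from 16 latent dimensions to a 4-value "image"), chosen so that the gradient can be written by hand. The names `generator` and `recover_latent`, the sizes, the learning rate and the number of steps are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_dim = 16, 4  # toy sizes, far smaller than a real GAN

# Hypothetical stand-in for the generator: a fixed random linear map.
W = rng.standard_normal((image_dim, latent_dim)) / np.sqrt(latent_dim)


def generator(z):
    """Toy generator phi(z) = W z (an assumption, not the Progressive GAN)."""
    return W @ z


def recover_latent(target, z_init, learning_rate, num_steps):
    """Gradient descent on z' to minimise ||phi(z') - target||^2."""
    z_rec = z_init.copy()
    for _ in range(num_steps):
        residual = generator(z_rec) - target
        grad = 2.0 * W.T @ residual  # gradient of the squared L2 loss w.r.t. z'
        z_rec -= learning_rate * grad
    return z_rec


z_true = rng.standard_normal(latent_dim)  # the "unknown" latent vector z
target = generator(z_true)                # the image we try to reproduce
z_init = rng.standard_normal(latent_dim)  # random starting point for z'

# Step size below the stability limit for this quadratic loss.
learning_rate = 0.5 / np.linalg.norm(W, 2) ** 2
z_rec = recover_latent(target, z_init, learning_rate, num_steps=2000)

image_error = np.linalg.norm(generator(z_rec) - target)
latent_error = np.linalg.norm(z_rec - z_true)
print(f"image error:  {image_error:.2e}")
print(f"latent error: {latent_error:.2e}")
```

In this toy setting the image error goes to practically zero while the latent error stays large, because many latent vectors map to the same output. A real GAN is non-linear, so the reasons may differ there, but the behaviour is similar to what I observe: different z’, almost identical image.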
Recovering my Face
This technique works very well for recovering a latent vector from a face generated by the same GAN. For an arbitrary real face, however, the algorithm does not manage to find a latent vector that generates a close enough version of the input face. This is what I obtain when I use my face as the input 😅:
The network is flexible enough to generate a face with similar colours and similar features. Nonetheless, the generated face is not an identical reproduction of the original image. It would be really cool to use the latent vector z’ recovered from my face to transition smoothly to other faces. Unfortunately, the transition is not smooth: there is a noticeable gap when transitioning from my face to the first generated face.
Even with the higher-resolution network (1024 by 1024 pixels), the results are not that good. A better result could probably be obtained by building a face autoencoder. It might also be possible to obtain more realistic faces by training such networks with a technique similar to the one used in Progressive GAN.