Theory: https://arxiv.org/abs/1804.07723.
NVIDIA Flickr-Faces-HQ (FFHQ) dataset: https://github.com/NVlabs/ffhq-dataset.
CelebAMask-HQ faces dataset: https://mmlab.ie.cuhk.edu.hk/projects/CelebA/CelebAMask_HQ.html
All images are augmented with random rotation, shift, and shear. The image preprocessing pipeline is in create_dataset.py. TFRecord files with serialized image tensors are used; the final function takes a list of file names.
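Below is a minimal sketch of how such TFRecord files could be read into a tf.data pipeline. The feature name (`"image"`), the 256x256x3 shape, and the batching parameters are assumptions; the actual schema is defined in create_dataset.py.

```python
import tensorflow as tf

def load_dataset(filenames, batch_size=32):
    """Sketch: read serialized image tensors from a list of TFRecord files."""
    feature_spec = {"image": tf.io.FixedLenFeature([], tf.string)}  # assumed feature name

    def parse(example_proto):
        parsed = tf.io.parse_single_example(example_proto, feature_spec)
        image = tf.io.parse_tensor(parsed["image"], out_type=tf.uint8)
        image = tf.reshape(image, [256, 256, 3])  # assumed image shape
        return tf.cast(image, tf.float32) / 255.0

    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=tf.data.AUTOTUNE)
    return (dataset
            .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
            .shuffle(1024)
            .batch(batch_size, drop_remainder=True)  # fixed batch size helps on TPU
            .prefetch(tf.data.AUTOTUNE))
```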
For mask generation, a set of random ellipses, circles, and lines is used. Examples are shown below:
The generator is located in the masks_generator.py file. Masks are expected to be pre-generated and stored in TFRecord files.
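A minimal sketch of generating such irregular masks from random ellipses, circles, and lines is shown below; the shape counts and size ranges are illustrative assumptions, and the actual logic lives in masks_generator.py.

```python
import numpy as np
import cv2

def random_mask(height=256, width=256, max_shapes=10):
    """Sketch: mask of 1s (valid pixels) with random 0-valued holes."""
    mask = np.ones((height, width), dtype=np.uint8)
    for _ in range(np.random.randint(1, max_shapes + 1)):
        shape = np.random.choice(["line", "circle", "ellipse"])
        x1, y1 = int(np.random.randint(0, width)), int(np.random.randint(0, height))
        if shape == "line":
            x2, y2 = int(np.random.randint(0, width)), int(np.random.randint(0, height))
            cv2.line(mask, (x1, y1), (x2, y2), 0, thickness=int(np.random.randint(5, 15)))
        elif shape == "circle":
            cv2.circle(mask, (x1, y1), int(np.random.randint(5, 30)), 0, thickness=-1)
        else:
            axes = (int(np.random.randint(5, 40)), int(np.random.randint(5, 40)))
            angle = int(np.random.randint(0, 180))
            cv2.ellipse(mask, (x1, y1), axes, angle, 0, 360, 0, thickness=-1)
    return mask
```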
All the code runs on TPU. The model was trained on a Google TPU v3 for about 6 hours. The hole loss from the article was replaced with a loss between Gaussian-blurred images, which was found to work better in some cases.
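The following is a minimal sketch of such a blurred-image loss; the kernel size, sigma, and the use of an L1 distance are assumptions, not the exact training configuration.

```python
import tensorflow as tf

def gaussian_kernel(size=7, sigma=2.0):
    """Normalized 2D Gaussian kernel (size and sigma are assumed values)."""
    ax = tf.range(-(size // 2), size // 2 + 1, dtype=tf.float32)
    xx, yy = tf.meshgrid(ax, ax)
    kernel = tf.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / tf.reduce_sum(kernel)

def gaussian_blur(images, size=7, sigma=2.0):
    """Blur a batch of NHWC images with a depthwise Gaussian convolution."""
    channels = images.shape[-1]
    kernel = tf.tile(gaussian_kernel(size, sigma)[:, :, None, None], [1, 1, channels, 1])
    return tf.nn.depthwise_conv2d(images, kernel, strides=[1, 1, 1, 1], padding="SAME")

def blurred_loss(y_true, y_pred):
    """L1 distance between Gaussian-blurred ground truth and prediction."""
    return tf.reduce_mean(tf.abs(gaussian_blur(y_true) - gaussian_blur(y_pred)))
```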
In addition, the model was found to be successful on the pix2pix task (https://github.com/mojaevr/edges2cats):