
Commit 52af1e0

gtoderici authored and copybara-github committed
Various fixes to HiFiC.
PiperOrigin-RevId: 318389553
Change-Id: I422136163a9211cd5cb6f463c5568ab44fb42e46
1 parent 8054e6a · commit 52af1e0

File tree: 8 files changed (+308, -108 lines)

models/hific/README.md

Lines changed: 86 additions & 24 deletions
@@ -1,5 +1,7 @@
 # High-Fidelity Generative Image Compression
 
+## PRE-RELEASE
+
 <div align="center">
 <a href='https://hific.github.io'>
 <img src='https://hific.github.io/social/thumb.jpg' width="80%"/>
@@ -30,71 +32,131 @@ use more than 2&times; the bitrate.
 We show some images on the [demo page](https://hific.github.io) and we
 release a
 [colab](https://colab.research.google.com/github/tensorflow/compression/blob/master/models/hific/colab.ipynb)
-update for interactively using our models on your own images.
+for interactively using our models on your own images.
+
+## Running models trained by us locally
+
+Use `tfci.py` to run our models locally for encoding and decoding images:
+
+```bash
+git clone https://github.com/tensorflow/compression
+cd compression/models
+python tfci.py compress <model> <PNG file>
+```
+
+where `model` can be one of `"hific-lo", "hific-mi", "hific-hi"`.
+
+**NOTE**: This is also directly available in the
+[colab](https://colab.research.google.com/github/tensorflow/compression/blob/master/models/hific/colab.ipynb)!
 
 ## Using the code
 
-In addition to `tensorflow_compression`, you need to install [`compare_gan`](https://github.com/google/compare_gan)
-and TensorFlow 1.15:
+To use the code, create a conda environment with Python 3.6
+(newer versions are not supported at the moment) and the following packages.
+
+**NOTE**: We only support CUDA 10.0, Python 3.6, and TensorFlow 1.15.
+TensorFlow must be installed via pip, not conda.
+No other setup works (we tested newer versions of TensorFlow
+and Python, and they don't work). We're working on a fix.
 
 ```bash
-pip install -r requirements.txt
+conda create --name hific python=3.6 cudatoolkit=10.0 cudnn
+conda activate hific
+pip install tensorflow-gpu==1.15  # Make sure to install TF via pip, not conda!
+pip install git+git://github.com/google/compare_gan@19922d3004b675c1a49c4d7515c06f6f75acdcc8
+pip install tensorflow-compression==1.3
+pip install Pillow
 ```
 
-## Running our models locally
+#### Note on cuDNN Errors
 
-Use `tfci.py` for locally running our models to encode and decode images:
+On some of our test machines, the code crashes with one of the following
+errors: "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR",
+"terminate called after throwing an instance of 'std::bad_alloc'",
+"Segmentation fault", or "Unknown: Failed to get convolution algorithm.
+This is probably because cuDNN".
 
-```python
-python tfci.py compress <model> <PNG file>
+In this case, try setting `TF_FORCE_GPU_ALLOW_GROWTH=true`, e.g.:
+```bash
+TF_FORCE_GPU_ALLOW_GROWTH=true python train.py ...
 ```
 
-where `model` can be one of `"hific-lo", "hific-mi", "hific-hi"`.
+#### Note on Memory Consumption
 
-## Code
+This model trains best on a V100. If you get out-of-memory errors
+("Resource exhausted: OOM"), try lowering the batch size
+(e.g., `--batch_size 6`), or tweak `num_residual_blocks` in `archs.py/Decoder`.
+
+If training is slow or stalls, try tweaking the `DATASET_NUM_PARALLEL` and
+`DATASET_PREFETCH` constants in `model.py`.
 
-The architecture is defined in `arch.py` , which is used to build the model in
-`model.py`. Our configurations are in `configs.py`.
 
 ### Training your own models.
 
+The architecture is defined in `archs.py`, which is used to build the model in
+`model.py`. Our configurations are in `configs.py`.
+
 We release a _simplified_ trainer in `train.py` as a starting point for custom
-training. Note that it's using [LSUN]() from [tfds]() which likely needs to be
-adapted to a bigger dataset to obtain state-of-the-art results (see below).
+training. Note that it uses
+[coco2014](https://cocodataset.org) from
+[tfds](https://www.tensorflow.org/datasets/api_docs/python/tfds), which likely
+needs to be swapped for a bigger dataset to obtain good results
+(see below).
 
 For the paper, we initialize our GAN models from a MSE+LPIPS checkpoint. To
 replicate this, first train a model for MSE + LPIPS only, and then use that as a
 starting point:
-
 ```bash
 # First train a model for MSE+LPIPS:
-python train.py --config mselpips --ckpt_dir ckpts --num_steps 1M
+python train.py --config mselpips --ckpt_dir ckpts/mse_lpips --num_steps 1M \
+    --tfds_dataset_name coco2014
 
 # Once that finishes, train a GAN model:
-python train.py --config hific --ckpt_dir ckpts \
-    --init_from ckpts/mselpips --num_steps 1M
+python train.py --config hific --ckpt_dir ckpts/hific \
+    --init_autoencoder_from_ckpt_dir ckpts/mse_lpips --num_steps 1M \
+    --tfds_dataset_name coco2014
```
+Additional helpful arguments are `--tfds_dataset_name`
+and `--tfds_downloads_dir`; see `--help` for more.
 
-To test a trained model, use `eval.py`:
+Note that TensorBoard summaries will be saved in `--ckpt_dir` as well. By
+default, we create summaries of inputs and reconstructions, which can use a
+lot of memory. Disable them with `--no-image-summaries`.
+
+To test a trained model, use `evaluate.py` (it also supports the `--tfds_*`
+flags):
 
 ```bash
-python eval.py --config hific --ckpt_dir ckpts/hific
+python evaluate.py --config hific --ckpt_dir ckpts/hific --out_dir out/ \
+    --tfds_dataset_name coco2014
 ```
 
 #### Adapting the dataset
 
-You can change to any other TFDS dataset by changing the `tfds_name` flag for
-`build_input`. To train on a custom dataset, you can replace the `_get_dataset`
+You can switch to any other TFDS dataset by adapting the `--tfds_dataset_name`,
+`--tfds_features_key`, and `--tfds_downloads_dir` flags of `train.py`.
+
+Note that when using TFDS, the dataset first has to be downloaded, which can
+take time. To do this separately, use the following code snippet:
+```python
+import tensorflow_datasets as tfds
+builder = tfds.builder(TFDS_DATASET_NAME, data_dir=TFDS_DOWNLOADS_DIR)
+builder.download_and_prepare()
+```
+
+To train on a custom dataset, you can replace the `_get_dataset`
 call in `train.py`.
 
 ## Citation
 
 If you use the work released here for your research, please cite this paper:
 
 ```
-@inproceedings{mentzer2020hific,
+@article{mentzer2020high,
   title={High-Fidelity Generative Image Compression},
-  author={Fabian Mentzer and George Toderici and Michael Tschannen and Eirikur Agustsson},
+  author={Mentzer, Fabian and Toderici, George and Tschannen, Michael and Agustsson, Eirikur},
+  journal={arXiv preprint arXiv:2006.09965},
   year={2020}
 }
 ```
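
The cuDNN note in this README sets `TF_FORCE_GPU_ALLOW_GROWTH` in the shell. For reference, a minimal sketch of the equivalent session-level setting in TF 1.15 (the `sess` name is illustrative and not part of this commit):

```python
import tensorflow.compat.v1 as tf

# Ask TF 1.x to grow GPU memory on demand instead of grabbing it all upfront,
# mirroring TF_FORCE_GPU_ALLOW_GROWTH=true from the README note above.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
```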

models/hific/archs.py

Lines changed: 20 additions & 19 deletions
@@ -30,7 +30,6 @@
 
 from .helpers import ModelMode
 
-
 SCALES_MIN = 0.11
 SCALES_MAX = 256
 SCALES_LEVELS = 64
@@ -327,7 +326,13 @@ def __init__(self,
     self._num_layers = num_layers
     self._num_filters_base = num_filters_base
 
+  def __call__(self, x):
+    """Overrides compare_gan's __call__, as we only need `x`."""
+    with tf.variable_scope(self.name, reuse=tf.AUTO_REUSE):
+      return self.apply(x)
+
   def apply(self, x):
+    """Overrides compare_gan's apply, as we only need `x`."""
     if not isinstance(x, tuple) or len(x) != 2:
       raise ValueError("Expected 2-tuple, got {}".format(x))
     x, latent = x
@@ -398,18 +403,8 @@ def __init__(self,
     self._name = name
     self._model = compare_gan_cls(**compare_gan_kwargs)
 
-  def call(self, x, training):
-    # compare_gan code distinguishes between training and evaluation using
-    # two entry points: __call__ for training, inference for evaluation.
-    # Depending on the mode, different norm modes are set.
-    # We switch depending on the training flag.
-    if training:
-      return self._model(x)
-    else:
-      return self._model.inference(x)
-
-  def apply_directly(self, x):
-    return self._model.apply(x)
+  def call(self, x):
+    return self._model(x)
 
   @property
   def trainable_variables(self):
@@ -418,7 +413,7 @@ def trainable_variables(self):
     # don't have training as a flag to the constructor, so we always return.
     # However, we only call trainable_variables when we are training.
     return tf.get_collection(
-        tf.GraphKeys.TRAINABLE_VARIABLES, scope=self._name)
+        tf.GraphKeys.TRAINABLE_VARIABLES, scope=self._model.name)
 
 
 class Discriminator(_CompareGANLayer):
@@ -433,13 +428,11 @@ class Hyperprior(tf.keras.layers.Layer):
   """Hyperprior architecture (probability model)."""
 
   def __init__(self,
-               round_latents_for_training=True,
                num_chan_bottleneck=220,
                num_filters=320,
                name="Hyperprior"):
     super(Hyperprior, self).__init__(name=name)
 
-    self._round_latents_for_training = round_latents_for_training
     self._num_chan_bottleneck = num_chan_bottleneck
     self._num_filters = num_filters
     self._analysis = tf.keras.Sequential([
@@ -537,9 +530,7 @@ def call(self, latents, image_shape, mode: ModelMode) -> HyperInfo:
 
     compressed = None
     if training:
-      latents_decoded = (entropy_info.quantized
-                         if self._round_latents_for_training else
-                         entropy_info.noisy)
+      latents_decoded = _quantize(latents, latent_means)
     elif validation:
       latents_decoded = entropy_info.quantized
     else:
@@ -565,6 +556,16 @@ def call(self, latents, image_shape, mode: ModelMode) -> HyperInfo:
     return info
 
 
+def _quantize(inputs, mean):
+  half = tf.constant(.5, dtype=tf.float32)
+  outputs = inputs
+  outputs -= mean
+  # Rounding latents for the forward pass (straight-through).
+  outputs = outputs + tf.stop_gradient(tf.math.floor(outputs + half) - outputs)
+  outputs += mean
+  return outputs
+
+
 class FactorizedPriorLayer(tf.keras.layers.Layer):
   """Factorized prior to code a discrete tensor."""
 
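
The new `_quantize` helper is a straight-through quantizer: the forward pass rounds the mean-centered latents half-up via `floor(x + 0.5)`, while `tf.stop_gradient` makes the backward pass treat the rounding as the identity. A self-contained sketch of the behavior (TF 1.x graph mode assumed; the tensors are illustrative):

```python
import tensorflow.compat.v1 as tf

def quantize_st(inputs, mean):
  # Forward: round-half-up of (inputs - mean), then re-add the mean.
  # Backward: identity, since the rounding is wrapped in tf.stop_gradient.
  centered = inputs - mean
  rounded = centered + tf.stop_gradient(tf.floor(centered + 0.5) - centered)
  return rounded + mean

x = tf.constant([1.2, 3.7])
y = quantize_st(x, tf.constant(0.))
grad = tf.gradients(y, x)[0]
with tf.Session() as sess:
  print(sess.run([y, grad]))  # [array([1., 4.]), array([1., 1.])]
```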

models/hific/evaluate.py

Lines changed: 30 additions & 10 deletions
@@ -20,8 +20,9 @@
 
 import argparse
 import itertools
-
-from absl import app
+import os
+import sys
+from PIL import Image
 
 import tensorflow.compat.v1 as tf
 
@@ -30,27 +31,41 @@
 from . import model
 
 
-def eval_trained_model(config_name, ckpt_dir, max_images=None):
+def eval_trained_model(config_name,
+                       ckpt_dir,
+                       out_dir,
+                       tfds_arguments: helpers.TFDSArguments,
+                       max_images=None):
   """Evaluate a trained model."""
   config = configs.get_config(config_name)
   hific = model.HiFiC(config, helpers.ModelMode.EVALUATION)
 
-  # Automatically uses the validation split of LSUN.
-  dataset = hific.build_input(batch_size=1, crop_size=None, tfds_name='lsun')
+  # Automatically uses the validation split.
+  dataset = hific.build_input(
+      batch_size=1, crop_size=None, tfds_arguments=tfds_arguments)
   iterator = tf.data.make_one_shot_iterator(dataset)
   get_next_image = iterator.get_next()
 
   output_image, bpp = hific.build_model(**get_next_image)
   input_image = get_next_image['input_image']
+
+  input_image = tf.cast(tf.round(input_image[0, ...]), tf.uint8)
+  output_image = tf.cast(tf.round(output_image[0, ...]), tf.uint8)
+
+  os.makedirs(out_dir, exist_ok=True)
+
   with tf.Session() as sess:
     hific.restore_trained_model(sess, ckpt_dir)
     for i in itertools.count(0):
       if max_images and i == max_images:
        break
       try:
-        input_, output_, bpp_ = sess.run([input_image, output_image, bpp])
-        # TODO(fab-jul): Save image, report bpp, etc.
-        print(input_.shape, output_.shape, bpp_)
+        inp_np, otp_np, bpp_np = sess.run([input_image, output_image, bpp])
+        print(f'Image {i}: {bpp_np:.3} bpp, saving in {out_dir}...')
+        Image.fromarray(inp_np).save(
+            os.path.join(out_dir, f'img_{i:010d}inp.png'))
+        Image.fromarray(otp_np).save(
+            os.path.join(out_dir, f'img_{i:010d}otp_{bpp_np:.3f}.png'))
       except tf.errors.OutOfRangeError:
         print('No more inputs')
         break
@@ -67,13 +82,18 @@ def parse_args(argv):
   parser.add_argument('--ckpt_dir', required=True,
                       help=('Path to the folder where checkpoints of the '
                             'trained model are.'))
+  parser.add_argument('--out_dir', required=True, help='Where to save outputs.')
+
+  helpers.add_tfds_arguments(parser)
+
   args = parser.parse_args(argv[1:])
   return args
 
 
 def main(args):
-  eval_trained_model(args.config, args.ckpt_dir)
+  eval_trained_model(args.config, args.ckpt_dir, args.out_dir,
+                     helpers.parse_tfds_arguments(args))
 
 
 if __name__ == '__main__':
-  app.run(main, flags_parser=parse_args)
+  main(parse_args(sys.argv))
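
Given the `…inp.png` / `…otp_….png` pairs that `eval_trained_model` now writes, a quick fidelity check takes a few lines of numpy. A minimal sketch (the file names follow the pattern above; the `0.123` bpp suffix is a placeholder that differs per image):

```python
import numpy as np
from PIL import Image

# Load one input/reconstruction pair saved by evaluate.py (names illustrative).
inp = np.asarray(Image.open('out/img_0000000000inp.png'), dtype=np.float64)
otp = np.asarray(Image.open('out/img_0000000000otp_0.123.png'), dtype=np.float64)

mse = ((inp - otp) ** 2).mean()
psnr = 10 * np.log10(255.0 ** 2 / mse)  # peak value 255 for uint8 images
print(f'PSNR: {psnr:.2f} dB')
```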

models/hific/helpers.py

Lines changed: 23 additions & 0 deletions
@@ -16,6 +16,7 @@
 """Some helper enums and classes, as well as LPIPS downloader."""
 
 
+import collections
 import enum
 import os
 import pprint
@@ -24,6 +25,9 @@
 
 _LPIPS_URL = "http://rail.eecs.berkeley.edu/models/lpips/net-lin_alex_v0.1.pb"
 
+TFDSArguments = collections.namedtuple(
+    "TFDSArguments", ["dataset_name", "features_key", "downloads_dir"])
+
 
 class ModelType(enum.Enum):
   # Train hyperprior model: encoder/decoder/entropy model.
@@ -66,3 +70,22 @@ def ensure_lpips_weights_exist(weight_path_out):
   if not os.path.isfile(weight_path_out):
     raise ValueError(f"Failed to download LPIPS weights from {_LPIPS_URL} "
                      f"to {weight_path_out}. Please manually download!")
+
+
+def add_tfds_arguments(parser):
+  parser.add_argument(
+      "--tfds_dataset_name", default="coco2014", help="TFDS dataset name.")
+  parser.add_argument(
+      "--tfds_downloads_dir",
+      default=None,
+      help=("Where TFDS stores data. "
+            "Defaults to ~/tensorflow_datasets."))
+  parser.add_argument(
+      "--tfds_features_key",
+      default="image",
+      help="Name of the TFDS feature to use.")
+
+
+def parse_tfds_arguments(args) -> TFDSArguments:
+  return TFDSArguments(args.tfds_dataset_name, args.tfds_features_key,
+                       args.tfds_downloads_dir)
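
`add_tfds_arguments` and `parse_tfds_arguments` are the two halves of a round trip shared by `train.py` and `evaluate.py`: the first registers the three `--tfds_*` flags on a parser, the second bundles the parsed values into the `TFDSArguments` namedtuple. A minimal sketch of that round trip (plain `import helpers` shown for a standalone run; inside the package it is `from . import helpers`):

```python
import argparse

import helpers  # within the hific package: from . import helpers

parser = argparse.ArgumentParser()
helpers.add_tfds_arguments(parser)
args = parser.parse_args(['--tfds_dataset_name', 'coco2014'])

# features_key and downloads_dir fall back to their defaults ('image', None).
tfds_args = helpers.parse_tfds_arguments(args)
print(tfds_args)
# TFDSArguments(dataset_name='coco2014', features_key='image', downloads_dir=None)
```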
