
Commit 94350e5

lucasb-eyer and Pandoro committed
Initial commit of re-implemented training code.
Co-authored-by: Alexander Hermans <[email protected]>
1 parent f47f9ef commit 94350e5

23 files changed: 38,282 additions & 8 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
__pycache__
*.pyc

README.md

Lines changed: 134 additions & 8 deletions
@@ -2,19 +2,35 @@
Code for reproducing the results of our "In Defense of the Triplet Loss for Person Re-Identification" paper.

-Both main authors are currently in an internship.
-We will publish the full training code after our internships, which is at the end of September 2017.
-(By "Watching" this project on GitHub, you will receive e-mails about updates to this repo.)
-Meanwhile, we provide the pre-trained weights for the TriNet model, as well as some rudimentary example code for using it to compute embeddings; see below.
+We provide the following things:
+- The exact pre-trained weights for the TriNet model as used in the paper, including some rudimentary example code for using it to compute embeddings.
+  See the section [Pretrained models](#pretrained-models).
+- A clean re-implementation of the training code that can be used for training your own models/data.
+  See the section [Training your own models](#training-your-own-models).
+- A script for evaluation which computes the CMC and mAP of embeddings stored in an HDF5 ("new .mat") file.
+  See the section [Evaluating embeddings](#evaluating-embeddings).

-# Pretrained Models
+If you use any of the provided code, please cite:
+```
+@article{HermansBeyer2017Arxiv,
+  title = {{In Defense of the Triplet Loss for Person Re-Identification}},
+  author = {Hermans*, Alexander and Beyer*, Lucas and Leibe, Bastian},
+  journal = {arXiv preprint arXiv:1703.07737},
+  year = {2017}
+}
+```
+
+# Pretrained models

-This is a first, simple release. A better, more generic script will follow in a few months, but this should be enough to get started trying out our models!
+We provide the exact TriNet model used in the paper, which was implemented in
+[Theano](http://deeplearning.net/software/theano/install.html)
+and
+[Lasagne](http://lasagne.readthedocs.io/en/latest/user/installation.html).

As a first step, download either of these pre-trained models:
- [TriNet trained on MARS](https://omnomnom.vision.rwth-aachen.de/data/trinet-mars.npz) (md5sum: `72fafa2ee9aa3765f038d06e8dd8ef4b`)
- [TriNet trained on Market1501](https://omnomnom.vision.rwth-aachen.de/data/trinet-market1501.npz) (md5sum: `5353f95d1489536129ec14638aded3c7`)
-- (LuNet models will follow.)

Next, create a file (`files.txt`) which contains the full path to the image files you want to embed, one filename per line, like so:
@@ -23,7 +39,7 @@ Next, create a file (`files.txt`) which contains the full path to the image file
/path/to/file2.jpg
```

-Finally, run the `trinet_embed.py` script, passing both the above file and the weights file you want to use, like so:
+Finally, run the `trinet_embed.py` script, passing both the above file and the pretrained model file you want to use, like so:

```
python trinet_embed.py files.txt /path/to/trinet-mars.npz
@@ -47,3 +63,113 @@ You can now do meaningful work by comparing these embeddings using the Euclidean
A couple of notes:
- The script depends on [Theano](http://deeplearning.net/software/theano/install.html), [Lasagne](http://lasagne.readthedocs.io/en/latest/user/installation.html) and [OpenCV Python](http://opencv.org/) (`pip install opencv-python`) being correctly installed.
- The input files should be crops of a full person standing upright, and they will be resized to `288x144` before being passed to the network.
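The note above about comparing embeddings with the Euclidean distance can be made concrete with a minimal sketch. The array name, its loading step, and the file name below are illustrative assumptions, not part of the provided script:

```python
import numpy as np

# emb: (N, D) array with one embedding per row (illustrative placeholder).
emb = np.load('embeddings.npy')  # assumed file name; adjust to however you stored them

# Pairwise Euclidean distances; a small distance suggests the same person.
dists = np.sqrt(np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1))

# For a query image at index q, rank all other images by ascending distance.
q = 0
ranking = np.argsort(dists[q])
print(ranking[1:11])  # ten closest matches, skipping the query itself
```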

# Training your own models

If you want more flexibility, we now provide code for training your own models.
This is not the code that was used in the paper (which became an unusable mess),
but rather a clean re-implementation of it in [TensorFlow](https://www.tensorflow.org/),
achieving about the same performance.

- **This repository requires at least version 1.4 of TensorFlow.**
- **The TensorFlow code is Python 3 only and won't work in Python 2!**
## Defining a dataset

A dataset consists of two things:

1. An `image_root` folder which contains all images, possibly in sub-folders.
2. A dataset `.csv` file describing the dataset.

To create a dataset, you simply create a new `.csv` file for it of the following form:

```
identity,relative_path/to/image.jpg
```

The `identity`, often called `PID` (`P`erson `ID`entity), corresponds to the "class name"; it can be an arbitrary string, but it must be the same for all images belonging to the same identity.

The `relative_path/to/image.jpg` is relative to the aforementioned `image_root`.
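For illustration, a dataset file for a hypothetical two-identity dataset could look like this (the identities and file names are made up; only the `identity,relative_path` structure matters):

```
0001,person_0001/cam1_0001.jpg
0001,person_0001/cam2_0003.jpg
0002,person_0002/cam1_0007.jpg
```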
## Training

Given the dataset file and the `image_root`, you can already train a model.
The minimal way of training a model is to just call `train.py` in the following way:

```
python train.py \
    --train_set data/market1501_train.csv \
    --image_root /absolute/image/root \
    --experiment_root ~/experiments/my_experiment
```

This will start training with all default parameters.
We recommend writing a script file similar to `market1501_train.sh` in which you define all of the parameters.
It is **highly recommended** that you tune hyperparameters such as `net_input_{height,width}`, `learning_rate`,
`decay_start_iteration`, and many more.
See the top of `train.py` for a list of all parameters.

As a convenience, we store all the parameters that were used for a run in `experiment_root/args.json`.
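For illustration, a fuller invocation in the spirit of such a script, tuning some of the hyperparameters named above, could look like this (the flag values are made-up examples, not the settings used in the paper):

```
python train.py \
    --train_set data/market1501_train.csv \
    --image_root /absolute/image/root \
    --experiment_root ~/experiments/my_experiment \
    --net_input_height 256 --net_input_width 128 \
    --learning_rate 3e-4 \
    --decay_start_iteration 15000
```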
### Pre-trained initialization

If you want to initialize the model using pre-trained weights, as was done for TriNet,
you need to specify the location of the checkpoint file through `--initial_checkpoint`.

For most common models, you can download the [checkpoints provided by Google here](https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models).
For example, that is where we get our ResNet50 pre-trained weights from,
and it is what you should pass as the second parameter to `market1501_train.sh`.
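For example, assuming you have downloaded and unpacked the slim ResNet-50 checkpoint, the training call could point to it like this (the checkpoint path is illustrative):

```
python train.py \
    --train_set data/market1501_train.csv \
    --image_root /absolute/image/root \
    --experiment_root ~/experiments/my_experiment \
    --initial_checkpoint /path/to/resnet_v1_50.ckpt
```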
## Interrupting and resuming training

Since training can take quite a while, interrupting and resuming training is important.
You can interrupt training at any time by hitting `Ctrl+C` or by sending `SIGINT (2)` or `SIGTERM (15)`
to the training process; it will finish the current batch, store the model and optimizer state,
and then terminate cleanly.
Because of the `args.json` file, you can later resume that run simply by running:

```
python train.py --experiment_root ~/experiments/my_experiment --resume
```

The last checkpoint is determined automatically by TensorFlow using the contents of the `checkpoint` file.
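For reference, that `checkpoint` file is a small text file maintained by TensorFlow's saver; it looks roughly like the following, where the snapshot names are illustrative and depend on how the training code names its checkpoints:

```
model_checkpoint_path: "checkpoint-25000"
all_model_checkpoint_paths: "checkpoint-20000"
all_model_checkpoint_paths: "checkpoint-25000"
```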
## Performance issues

For some reason, current TensorFlow is known to have inconsistent performance and can sometimes become very slow.
The only workaround we currently know of is to install Google's performance tools and preload tcmalloc:

```
env LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 python train.py ...
```

This fixes the issue for us most of the time, but not always.
If you know more, please open an issue and let us know!
## Out of memory

The setup as described in the paper requires a high-end GPU with a lot of memory.
If you don't have that, you can still train a model, but you should either use a smaller network
or adjust the batch size, which in turn also changes the learning difficulty and might affect results.

The two arguments for playing with the batch size are `--batch_p`, which controls the number of distinct
persons in a batch, and `--batch_k`, which controls the number of pictures per person.
We usually lower `batch_p` first.
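For example, a reduced-memory run could lower `batch_p` (and, if that is still not enough, `batch_k`); the values below are illustrative, and the defaults are listed at the top of `train.py`:

```
python train.py \
    --train_set data/market1501_train.csv \
    --image_root /absolute/image/root \
    --experiment_root ~/experiments/my_experiment \
    --batch_p 8 \
    --batch_k 4
```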
## Custom network architecture

TODO: Documentation. It's also pretty straightforward.

### The core network

### The network head

## Computing embeddings

TODO: Will be added later.

# Evaluating embeddings

TODO: Will be added later.

common.py

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
""" A bunch of general utilities shared by train/embed/eval """

from argparse import ArgumentTypeError
import os

import numpy as np
import tensorflow as tf

# Commandline argument parsing
###

def check_directory(arg, access=os.W_OK, access_str="writeable"):
    """ Check for directory-type argument validity.

    Checks whether the given `arg` commandline argument is either a readable
    existing directory, or a creatable/writeable directory.

    Args:
        arg (string): The commandline argument to check.
        access (constant): What access rights to the directory are requested.
        access_str (string): Used for the error message.

    Returns:
        The string passed in `arg` if the checks succeed.

    Raises:
        ArgumentTypeError if the checks fail.
    """
    path_head = arg
    while path_head:
        if os.path.exists(path_head):
            if os.access(path_head, access):
                # Seems legit, but it still doesn't guarantee a valid path.
                # We'll just go with it for now though.
                return arg
            else:
                raise ArgumentTypeError(
                    'The provided string `{0}` is not a valid {1} path '
                    'since {2} is an existing folder without {1} access.'
                    ''.format(arg, access_str, path_head))
        path_head, _ = os.path.split(path_head)

    # No part of the provided string exists and can be written on.
    raise ArgumentTypeError('The provided string `{}` is not a valid {}'
                            ' path.'.format(arg, access_str))


def writeable_directory(arg):
    """ To be used as a type for `ArgumentParser.add_argument`. """
    return check_directory(arg, os.W_OK, "writeable")


def readable_directory(arg):
    """ To be used as a type for `ArgumentParser.add_argument`. """
    return check_directory(arg, os.R_OK, "readable")


def number_greater_x(arg, type_, x):
    """ Parse `arg` as `type_` and verify it is strictly greater than `x`. """
    try:
        value = type_(arg)
    except ValueError:
        raise ArgumentTypeError('The argument "{}" is not an {}.'.format(
            arg, type_.__name__))

    if value > x:
        return value
    else:
        raise ArgumentTypeError('Found {} where a {} greater than {} was '
                                'required.'.format(arg, type_.__name__, x))


def positive_int(arg):
    return number_greater_x(arg, int, 0)


def nonnegative_int(arg):
    return number_greater_x(arg, int, -1)


def positive_float(arg):
    return number_greater_x(arg, float, 0)


def float_or_string(arg):
    """ Tries to convert the string to float, otherwise returns the string. """
    try:
        return float(arg)
    except (ValueError, TypeError):
        return arg


# Dataset handling
###


def load_dataset(csv_file, image_root, fail_on_missing=True):
    """ Loads a dataset .csv file, returning PIDs and FIDs.

    PIDs are the "person IDs", i.e. class names/labels.
    FIDs are the "file IDs", which are individual relative filenames.

    Args:
        csv_file (string, file-like object): The csv data file to load.
        image_root (string): The path to which the image files, as stored in
            the csv file, are relative. Used for verification purposes.
        fail_on_missing (bool): If one or more files from the dataset are not
            present in the `image_root`, either raise an IOError (if True) or
            remove them from the returned dataset (if False).

    Returns:
        (pids, fids) a tuple of numpy string arrays corresponding to the PIDs,
        i.e. the identities/classes/labels, and the FIDs, i.e. the filenames.

    Raises:
        IOError if any one file is missing and `fail_on_missing` is True.
    """
    dataset = np.genfromtxt(csv_file, delimiter=',', dtype='|U')
    pids, fids = dataset.T

    # Check if all files exist.
    missing = np.full(len(fids), False, dtype=bool)
    for i, fid in enumerate(fids):
        missing[i] = not os.path.isfile(os.path.join(image_root, fid))

    missing_count = np.sum(missing)
    if missing_count > 0:
        if fail_on_missing:
            raise IOError('Using the `{}` file and `{}` as an image root, {} of'
                          ' {} images are missing.'.format(
                              csv_file, image_root, missing_count, len(fids)))
        else:
            print('[Warning] removing {} missing file(s) from the'
                  ' dataset.'.format(missing_count))
            # We simply remove the missing files.
            fids = fids[np.logical_not(missing)]
            pids = pids[np.logical_not(missing)]

    return pids, fids


def fid_to_image(fid, pid, image_root, image_size):
    """ Loads and resizes an image given by FID. Passes through the PID. """
    # Since there is no symbolic path.join, we just add a '/' to be sure.
    image_encoded = tf.read_file(tf.reduce_join([image_root, '/', fid]))

    # tf.image.decode_image doesn't set the shape, not even the dimensionality,
    # because it potentially loads animated .gif files. Instead, we use either
    # decode_jpeg or decode_png, each of which can decode both.
    # Sounds ridiculous, but is true:
    # https://github.com/tensorflow/tensorflow/issues/9356#issuecomment-309144064
    image_decoded = tf.image.decode_jpeg(image_encoded, channels=3)
    image_resized = tf.image.resize_images(image_decoded, image_size)

    return image_resized, fid, pid
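To illustrate how these helpers fit together, here is a small, hypothetical usage sketch (not part of this commit, and not necessarily how `train.py` does it) that wires the argument-parsing types and the dataset functions into a `tf.data` input pipeline; the flag names and the `288x144` size are taken from the README above, everything else is illustrative:

```python
from argparse import ArgumentParser

import tensorflow as tf

import common

parser = ArgumentParser()
parser.add_argument('--train_set', required=True)
parser.add_argument('--image_root', required=True, type=common.readable_directory)
parser.add_argument('--experiment_root', required=True, type=common.writeable_directory)
parser.add_argument('--batch_k', default=4, type=common.positive_int)
args = parser.parse_args()

# Load PIDs and FIDs, dropping entries whose image file is missing.
pids, fids = common.load_dataset(args.train_set, args.image_root, fail_on_missing=False)

# Build a simple input pipeline: (fid, pid) -> (resized image, fid, pid).
dataset = tf.data.Dataset.from_tensor_slices((fids, pids))
dataset = dataset.map(lambda fid, pid: common.fid_to_image(
    fid, pid, image_root=tf.constant(args.image_root), image_size=(288, 144)))
images, fids_t, pids_t = dataset.batch(args.batch_k).make_one_shot_iterator().get_next()
```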
