This would provide the ability to serve multiple models, and multiple versions of each model, with a single serving instance.
Details can be seen here: https://www.tensorflow.org/tfx/serving/serving_config#model_server_configuration
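For illustration, a models.config file could list several models using the ModelServerConfig text format described on that page (the model names, paths, and versions below are placeholders):

model_config_list {
  config {
    name: "ende"                # placeholder model name
    base_path: "/models/ende"   # placeholder path inside the container
    model_platform: "tensorflow"
  }
  config {
    name: "enfr"
    base_path: "/models/enfr"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 1             # serve versions 1 and 2 side by side
        versions: 2
      }
    }
  }
}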
In practice, this could be achieved by adding a --model_config option to be used in place of the --model argument, for example:
nvidia-docker run nmtwizard/opennmt-tf \
  --storage_config storages.json \
  --model_storage s3_model: \
  --model_config /path/to/models.config \
  --gpuid 1 \
  serve --host 0.0.0.0 --port 5000
Then, when entrypoint.py in the opennmt-tf Docker image invokes tensorflow_model_server, it could pass --model_config_file instead of --model_name and --model_base_path.
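As a rough sketch of that change (the function and variable names here are assumptions, not the actual entrypoint.py code), the server launch could branch on whether a config file was provided:

import subprocess

def start_model_server(port, model_config_file=None,
                       model_name=None, model_base_path=None):
    # Build the tensorflow_model_server command line. If a config file is
    # given, pass --model_config_file; otherwise fall back to the current
    # single-model flags. The port value is a placeholder.
    cmd = ["tensorflow_model_server", "--port=%d" % port]
    if model_config_file is not None:
        cmd.append("--model_config_file=%s" % model_config_file)
    else:
        cmd.append("--model_name=%s" % model_name)
        cmd.append("--model_base_path=%s" % model_base_path)
    return subprocess.Popen(cmd)

Keeping the fallback branch would preserve the current single-model behaviour when only --model is given.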