
Commit ee35a03

Update readme with ShapeMask instructions.

PiperOrigin-RevId: 317896878
Parent: b9a87a5

2 files changed: +136 −4 lines

official/README.md (2 additions, 0 deletions)

```diff
@@ -43,13 +43,15 @@ In the near future, we will add:
 |-------|-------------------|
 | [MNIST](vision/image_classification) | A basic model to classify digits from the [MNIST dataset](http://yann.lecun.com/exdb/mnist/) |
 | [ResNet](vision/image_classification) | [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) |
+| [EfficientNet](vision/image_classification) | [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) |

 #### Object Detection and Segmentation

 | Model | Reference (Paper) |
 |-------|-------------------|
 | [RetinaNet](vision/detection) | [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002) |
 | [Mask R-CNN](vision/detection) | [Mask R-CNN](https://arxiv.org/abs/1703.06870) |
+| [ShapeMask](vision/detection) | [ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors](https://arxiv.org/abs/1904.03239) |

 ### Natural Language Processing
```
official/vision/detection/README.md (134 additions, 4 deletions)
The per-parser `use_bfloat16` entries are removed from the RetinaNet and Mask R-CNN example configs, since the flag is already set under `architecture`:

```diff
@@ -123,8 +123,6 @@ predict:
  predict_batch_size: 8
 architecture:
  use_bfloat16: False
-retinanet_parser:
- use_bfloat16: False
 train:
  total_steps: 1
  batch_size: 8
@@ -245,8 +243,6 @@ predict:
  predict_batch_size: 8
 architecture:
  use_bfloat16: False
-maskrcnn_parser:
- use_bfloat16: False
 train:
  total_steps: 1000
  batch_size: 8
```

The remaining hunk (`@@ -255,6 +251,140 @@`) appends the following ShapeMask instructions after the existing Mask R-CNN example:

## Train ShapeMask on TPU

### Train a ResNet-50 based ShapeMask.

```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
RESNET_CHECKPOINT="<path to the pre-trained ResNet-50 checkpoint>"
TRAIN_FILE_PATTERN="<path to the TFRecord training data>"
EVAL_FILE_PATTERN="<path to the TFRecord validation data>"
VAL_JSON_FILE="<path to the validation annotation JSON file>"
SHAPE_PRIOR_PATH="<path to shape priors>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --params_override="{train: { checkpoint: { path: ${RESNET_CHECKPOINT}, prefix: resnet50/ }, train_file_pattern: ${TRAIN_FILE_PATTERN} }, eval: { val_json_file: ${VAL_JSON_FILE}, eval_file_pattern: ${EVAL_FILE_PATTERN} }, shapemask_head: {use_category_for_mask: true, shape_prior_path: ${SHAPE_PRIOR_PATH}} }"
```
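The `--params_override` flag takes an inline YAML/JSON-style mapping that is deep-merged over the model's default parameters. The sketch below illustrates that merge semantics in plain Python; the `merge_override` helper and the default values are illustrative only, not the Model Garden's actual params utilities:

```python
def merge_override(defaults: dict, override: dict) -> dict:
    """Recursively merge an override mapping into a copy of the defaults."""
    merged = dict(defaults)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested mapping: merge key by key instead of replacing wholesale.
            merged[key] = merge_override(merged[key], value)
        else:
            merged[key] = value
    return merged

# Illustrative defaults (placeholder numbers, not the real config values).
defaults = {"train": {"total_steps": 22500, "batch_size": 64},
            "eval": {"eval_samples": 5000}}
override = {"train": {"batch_size": 8}}
print(merge_override(defaults, override))
# {'train': {'total_steps': 22500, 'batch_size': 8}, 'eval': {'eval_samples': 5000}}
```

Only the keys named in the override change; everything else keeps its default, which is why the commands in this README override just a handful of paths and sizes.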

The pre-trained ResNet-50 checkpoint can be downloaded [here](https://storage.cloud.google.com/cloud-tpu-checkpoints/model-garden-vision/detection/resnet50-2018-02-07.tar.gz).

The shape priors can be downloaded [here](https://storage.googleapis.com/cloud-tpu-checkpoints/shapemask/kmeans_class_priors_91x20x32x32.npy).

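As a quick sanity check after downloading, the priors file can be loaded with NumPy. Judging by its name, `kmeans_class_priors_91x20x32x32.npy` should hold 20 k-means mask priors of size 32×32 for each of 91 COCO category slots; that layout is an assumption read off the filename, and the `check_shape_priors` helper below is hypothetical:

```python
import numpy as np

def check_shape_priors(priors: np.ndarray,
                       num_classes: int = 91,
                       priors_per_class: int = 20,
                       mask_size: int = 32) -> bool:
    """Return True if the prior array matches the expected layout."""
    return priors.shape == (num_classes, priors_per_class, mask_size, mask_size)

# In practice you would load the downloaded file:
#   priors = np.load("kmeans_class_priors_91x20x32x32.npy")
# A dummy array of the same layout stands in here for illustration.
dummy = np.zeros((91, 20, 32, 32), dtype=np.float32)
print(check_shape_priors(dummy))  # True
```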
### Train a custom ShapeMask using the config file.

First, create a YAML config file, e.g. *my_shapemask.yaml*.
This file specifies the parameters to be overridden:

```yaml
# my_shapemask.yaml
train:
  train_file_pattern: <path to the TFRecord training data>
  total_steps: <total steps to train>
  batch_size: <training batch size>
eval:
  eval_file_pattern: <path to the TFRecord validation data>
  val_json_file: <path to the validation annotation JSON file>
  batch_size: <evaluation batch size>
shapemask_head:
  shape_prior_path: <path to shape priors>
```

Once the YAML config file is created, you can launch the training with the following command:

```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --config_file="my_shapemask.yaml"
```

## Train ShapeMask on GPU

Training on GPU is similar to training on TPU. The major change is the strategy
type: use
"[mirrored](https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy)"
for multiple GPUs and
"[one_device](https://www.tensorflow.org/api_docs/python/tf/distribute/OneDeviceStrategy)"
for a single GPU.
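The GPU-count-to-strategy rule can be made explicit with a tiny helper; `pick_strategy_type` is hypothetical, shown only to spell out the mapping used by the commands below, and is not part of the repository:

```python
def pick_strategy_type(num_gpus: int) -> str:
    """Choose the --strategy_type flag value for a GPU host:
    'mirrored' for multi-GPU, 'one_device' for a single GPU."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is required for GPU training")
    return "mirrored" if num_gpus > 1 else "one_device"

print(pick_strategy_type(8))  # mirrored
print(pick_strategy_type(1))  # one_device
```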

Multi-GPU example (assuming there are 8 GPUs connected to the host):

```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=mirrored \
  --num_gpus=8 \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --config_file="my_shapemask.yaml"
```

A single-GPU example:

```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=one_device \
  --num_gpus=1 \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --config_file="my_shapemask.yaml"
```

An example with inline configuration (YAML or JSON format):

```bash
python3 ~/models/official/vision/detection/main.py \
  --model_dir=<model folder> \
  --strategy_type=one_device \
  --num_gpus=1 \
  --mode=train \
  --model=shapemask \
  --params_override="eval:
 eval_file_pattern: <Eval TFRecord file pattern>
 batch_size: 8
 val_json_file: <COCO format groundtruth JSON file>
train:
 total_steps: 1000
 batch_size: 8
 train_file_pattern: <Train TFRecord file pattern>
use_tpu: False
"
```

### Run the evaluation (after training)

```bash
python3 /usr/share/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=eval \
  --model=shapemask \
  --params_override="{eval: { val_json_file: ${VAL_JSON_FILE}, eval_file_pattern: ${EVAL_FILE_PATTERN}, eval_samples: 5000 } }"
```

`MODEL_DIR` needs to point to the directory that contains the trained ShapeMask checkpoint.
Change `--strategy_type=mirrored` and `--num_gpus=1` to run the evaluation on a GPU.

Note: The JSON groundtruth file is useful for the [COCO dataset](http://cocodataset.org/#home) and can be
downloaded from the [COCO website](http://cocodataset.org/#download). For a custom dataset, it is unnecessary because the groundtruth can be included in the TFRecord files.
