Commit 09b3f5a

Internal change

PiperOrigin-RevId: 481936708

1 parent 54a70ba commit 09b3f5a

12 files changed: +2266 -0 lines changed

official/projects/mosaic/README.md (121 additions, 0 deletions)
# MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context

[![Paper](http://img.shields.io/badge/Paper-arXiv.2112.11623-B3181B?logo=arXiv)](https://arxiv.org/abs/2112.11623)

This repository is the official implementation of the following paper.

* [MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context](https://arxiv.org/abs/2112.11623)
## Description

MOSAIC is a neural network architecture for efficient and accurate semantic
image segmentation on mobile devices. It is built from neural operations that
are commonly supported across diverse mobile hardware platforms, enabling
flexible deployment. With a simple asymmetric encoder-decoder structure,
consisting of an efficient multi-scale context encoder and a lightweight
hybrid decoder that recovers spatial details from aggregated information,
MOSAIC achieves a better balance between accuracy and computational cost.
Deployed on top of a tailored feature extraction backbone based on a searched
classification network, MOSAIC achieves a 5% absolute accuracy gain on ADE20K
with similar or lower latency compared to the current industry standard
MLPerf Mobile v1.0 models and state-of-the-art architectures.

[MLPerf Mobile v2.0](https://mlcommons.org/en/inference-mobile-20/) included
MOSAIC as a new industry-standard benchmark model for image segmentation.
See the details [here](https://mlcommons.org/en/news/mlperf-inference-1q2022/).

You can also refer to the [MLCommons GitHub repository](https://github.com/mlcommons/mobile_open/tree/main/vision/mosaic).
## History

### Oct 13, 2022

* First release of MOSAIC in TensorFlow 2, including checkpoints pretrained
  on Cityscapes.

## Maintainers

* Weijun Wang ([weijunw-g](https://github.com/weijunw-g))
* Fang Yang ([fyangf](https://github.com/fyangf))
* Shixin Luo ([luotigerlsx](https://github.com/luotigerlsx))
## Requirements

[![Python](https://img.shields.io/pypi/pyversions/tensorflow.svg?style=plastic)](https://badge.fury.io/py/tensorflow)
[![tf-models-official PyPI](https://badge.fury.io/py/tf-models-official.svg)](https://badge.fury.io/py/tf-models-official)
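MOSAIC is part of the TF Model Garden; assuming you work from the released
PyPI package rather than a source checkout of this repository, a typical
setup is:

```shell
pip3 install tf-models-official
```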
## Results

The following table shows the mIoU measured on the `cityscapes` dataset.

| Config | Backbone | Resolution | branch_filter_depths | pyramid_pool_bin_nums | mIoU | Download |
|-------------------------|:--------------------:|:----------:|:--------------------:|:---------------------:|:-----:|:--------:|
| Paper reference config | MobileNetMultiAVGSeg | 1024x2048 | [32, 32] | [4, 8, 16] | 75.98 | [ckpt](https://storage.googleapis.com/tf_model_garden/vision/mosaic/MobileNetMultiAVGSeg-r1024-ebf32-nogp.tar.gz)<br>[tensorboard](https://tensorboard.dev/experiment/okEog90bSwupajFgJwGEIw/#scalars) |
| Current best config | MobileNetMultiAVGSeg | 1024x2048 | [64, 64] | [1, 4, 8, 16] | 77.24 | [ckpt](https://storage.googleapis.com/tf_model_garden/vision/mosaic/MobileNetMultiAVGSeg-r1024-ebf64-gp.tar.gz)<br>[tensorboard](https://tensorboard.dev/experiment/l5hkV7JaQM23EXeOBT6oJg/#scalars) |

* `branch_filter_depths`: the number of convolution channels in each branch at
  a pyramid level after `Spatial Pyramid Pooling`
* `pyramid_pool_bin_nums`: the number of bins at each level of the `Spatial
  Pyramid Pooling` (see the sketch below)
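To make these two knobs concrete, here is a minimal sketch of a Spatial
Pyramid Pooling module parameterized the same way. It is illustrative only,
not the project's actual implementation: the function name, the choice of
separable convolutions, and the assumption of a statically shaped input are
all ours.

```python
import tensorflow as tf


def spatial_pyramid_pool(features,
                         pyramid_pool_bin_nums=(1, 4, 8, 16),
                         branch_filter_depths=(64, 64),
                         conv_kernel_sizes=(3, 5)):
  """Pools `features` into b x b bins per pyramid level, then fuses conv branches."""
  # Assumes a statically shaped [batch, height, width, channels] tensor.
  height, width = features.shape[1], features.shape[2]
  level_outputs = []
  for bins in pyramid_pool_bin_nums:
    # Average-pool the feature map down to a `bins x bins` grid.
    pooled = tf.keras.layers.AveragePooling2D(
        pool_size=(height // bins, width // bins))(features)
    # One parallel conv branch per (filter depth, kernel size) pair.
    branches = [
        tf.keras.layers.SeparableConv2D(
            depth, ksize, padding='same', activation='relu')(pooled)
        for depth, ksize in zip(branch_filter_depths, conv_kernel_sizes)
    ]
    merged = tf.concat(branches, axis=-1)
    # Resize back to the input resolution so all levels can be concatenated.
    level_outputs.append(tf.image.resize(merged, (height, width)))
  return tf.concat(level_outputs, axis=-1)
```

Small bin counts summarize global context while larger ones preserve more
spatial layout, which is why the best config adds the extra `1` bin level.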
## Training

Training can run on Google Cloud Platform using Cloud TPU.
[Here](https://cloud.google.com/tpu/docs/how-to) are the instructions for
using Cloud TPU. Follow them to set up a Cloud TPU (see the `gcloud` sketch
after the command below if you still need to create one), then launch
training by:
```shell
EXP_TYPE=mosaic_mnv35_cityscapes
EXP_NAME="<experiment-name>"  # You can give any name to the experiment.
TPU_NAME="<tpu-name>"  # The name assigned while creating a Cloud TPU.
MODEL_DIR="gs://<path-to-model-directory>"
# Now launch the experiment.
python3 -m official.projects.mosaic.train \
  --experiment=$EXP_TYPE \
  --mode=train \
  --tpu=$TPU_NAME \
  --model_dir=$MODEL_DIR \
  --config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml
```
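If you still need to create the TPU, the creation command typically looks
like the sketch below; the zone, accelerator type, and runtime version are
placeholders to fill in from the Cloud TPU documentation, not values
prescribed by this project:

```shell
# Illustrative only; consult the Cloud TPU docs for current values.
gcloud compute tpus tpu-vm create "<tpu-name>" \
  --zone="<zone>" \
  --accelerator-type="<accelerator-type>" \
  --version="<tpu-vm-tf-runtime-version>"
```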

## Evaluation

Run the following command for evaluation:
```shell
EXP_TYPE=mosaic_mnv35_cityscapes
EXP_NAME="<experiment-name>"  # You can give any name to the experiment.
TPU_NAME="<tpu-name>"  # The name assigned while creating a Cloud TPU.
MODEL_DIR="gs://<path-to-model-directory>"
# Now launch the evaluation.
python3 -m official.projects.mosaic.train \
  --experiment=$EXP_TYPE \
  --mode=eval \
  --tpu=$TPU_NAME \
  --model_dir=$MODEL_DIR \
  --config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml
```
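Because MOSAIC targets on-device inference (it is an MLPerf Mobile benchmark
model), a common follow-up is converting an exported SavedModel to TFLite.
The snippet below is plain TensorFlow API rather than a project-specific
exporter, and the SavedModel path and output filename are placeholders:

```python
import tensorflow as tf

# Placeholder path: point this at a SavedModel exported from the checkpoint.
converter = tf.lite.TFLiteConverter.from_saved_model('/path/to/saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantization
tflite_model = converter.convert()

with open('mosaic.tflite', 'wb') as f:
  f.write(tflite_model)
```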

## License

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

This project is licensed under the terms of the **Apache License 2.0**.
## Citation

If you want to cite this repository in your work, please consider citing the
paper.

```
@article{weijun2021mosaic,
  title={MOSAIC: Mobile Segmentation via decoding Aggregated Information and
         encoded Context},
  author={Weijun Wang and Andrew Howard},
  journal={arXiv preprint arXiv:2112.11623},
  year={2021},
}
```
official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml (87 additions, 0 deletions)
# Using TensorFlow datasets: 'cityscapes/semantic_segmentation'
# Some expected flags to use with the xmanager launcher:
# --experiment_type=mosaic_mnv35_cityscapes
# --tpu_topology=4x4
# mIoU: 77.24%
runtime:
  distribution_strategy: 'tpu'
  mixed_precision_dtype: 'float32'
task:
  model:
    num_classes: 19
    input_size: [null, null, 3]
    backbone:
      type: 'mobilenet'
      mobilenet:
        model_id: 'MobileNetMultiAVGSeg'
        output_intermediate_endpoints: true
        output_stride: 16
    neck:
      branch_filter_depths: [64, 64]
      conv_kernel_sizes: [3, 5]
      pyramid_pool_bin_nums: [1, 4, 8, 16]
      dropout_rate: 0.0
    head:
      num_classes: 19
      decoder_input_levels: ['3/depthwise', '2/depthwise']
      decoder_stage_merge_styles: ['concat_merge', 'sum_merge']
      decoder_filters: [64, 64]
      decoder_projected_filters: [19, 19]
    norm_activation:
      activation: relu
      norm_epsilon: 0.001
      norm_momentum: 0.99
      use_sync_bn: true
  init_checkpoint: 'gs://tf_model_garden/vision/mobilenet/v3.5multiavg_seg_float/'
  init_checkpoint_modules: 'backbone'
  losses:
    l2_weight_decay: 1.0e-04
  train_data:
    output_size: [1024, 2048]
    crop_size: [1024, 2048]
    input_path: ''
    tfds_name: 'cityscapes/semantic_segmentation'
    tfds_split: 'train'
    is_training: true
    global_batch_size: 32
    dtype: 'float32'
    aug_rand_hflip: true
    aug_scale_max: 2.0
    aug_scale_min: 0.5
  validation_data:
    output_size: [1024, 2048]
    input_path: ''
    tfds_name: 'cityscapes/semantic_segmentation'
    tfds_split: 'validation'
    is_training: false
    global_batch_size: 32
    dtype: 'float32'
    drop_remainder: false
    resize_eval_groundtruth: true
trainer:
  optimizer_config:
    learning_rate:
      polynomial:
        decay_steps: 100000
        initial_learning_rate: 0.1
        power: 0.9
      type: polynomial
    optimizer:
      sgd:
        momentum: 0.9
      type: sgd
    warmup:
      linear:
        name: linear
        warmup_learning_rate: 0
        warmup_steps: 925
      type: linear
  steps_per_loop: 92  # 2975 / 32 = 92
  summary_interval: 92
  train_steps: 100000
  validation_interval: 92
  validation_steps: 16  # 500 / 32 = 16
  checkpoint_interval: 92
  best_checkpoint_export_subdir: 'best_ckpt'
  best_checkpoint_eval_metric: 'mean_iou'
  best_checkpoint_metric_comp: 'higher'
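The registered experiment can also be inspected from Python. A minimal
sketch follows; it assumes the `tf-models-official` package layout and that
importing `official.projects.mosaic.train` registers the
`mosaic_mnv35_cityscapes` experiment (that registration import is an
assumption on our part, not something documented above):

```python
# Minimal sketch: load the registered experiment config and inspect it.
# Assumption: importing the project's train module registers the experiment.
from official.core import exp_factory
from official.projects.mosaic import train  # noqa: F401 (registration side effect)

config = exp_factory.get_exp_config('mosaic_mnv35_cityscapes')
print(config.task.model.neck.pyramid_pool_bin_nums)  # e.g. [1, 4, 8, 16]
print(config.trainer.train_steps)
```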
