| 1 | +# MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context |
| 2 | + |
| 3 | +[arXiv:2112.11623](https://arxiv.org/abs/2112.11623) |
| 4 | + |
| 5 | +This repository is the official implementation of the following |
| 6 | +paper. |
| 7 | + |
| 8 | +* [MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context](https://arxiv.org/abs/2112.11623) |
| 9 | + |
| 10 | +## Description |
| 11 | + |
| 12 | +MOSAIC is a neural network architecture for efficient and accurate semantic |
| 13 | +image segmentation on mobile devices. MOSAIC is designed using neural |
| 14 | +operations that are commonly supported across diverse mobile hardware |
| 15 | +platforms, which allows flexible deployment. With a simple asymmetric |
| 16 | +encoder-decoder structure which consists of an efficient multi-scale context |
| 17 | +encoder and a light-weight hybrid decoder to recover spatial details from |
| 18 | +aggregated information, MOSAIC achieves a better balance between accuracy |
| 19 | +and computational cost. Deployed on top of a tailored |
| 20 | +feature extraction backbone based on a searched classification network, MOSAIC |
| 21 | +achieves a 5% absolute accuracy gain on ADE20K with similar or lower latency |
| 22 | +compared to the current industry standard MLPerf mobile v1.0 models and |
| 23 | +state-of-the-art architectures. |
| 24 | + |
| 25 | +[MLPerf Mobile v2.0](https://mlcommons.org/en/inference-mobile-20/) included |
| 26 | +MOSAIC as a new industry standard benchmark model for image segmentation. |
| 27 | +Please see details [here](https://mlcommons.org/en/news/mlperf-inference-1q2022/). |
| 28 | + |
| 29 | +You can also refer to the [MLCommons GitHub repository](https://github.com/mlcommons/mobile_open/tree/main/vision/mosaic). |
| 30 | + |
| 31 | +## History |
| 32 | + |
| 33 | +### Oct 13, 2022 |
| 34 | + |
| 35 | +* First release of MOSAIC in TensorFlow 2 including checkpoints that have been |
| 36 | + pretrained on Cityscapes. |
| 37 | + |
| 38 | +## Maintainers |
| 39 | + |
| 40 | +* Weijun Wang ([weijunw-g](https://github.com/weijunw-g)) |
| 41 | +* Fang Yang ([fyangf](https://github.com/fyangf)) |
| 42 | +* Shixin Luo ([luotigerlsx](https://github.com/luotigerlsx)) |
| 43 | + |
| 44 | +## Requirements |
| 45 | + |
| 46 | +* [tensorflow](https://badge.fury.io/py/tensorflow) |
| 47 | +* [tf-models-official](https://badge.fury.io/py/tf-models-official) |
| 48 | + |
| 49 | +## Results |
| 50 | + |
| 51 | +The following table shows the mIoU measured on the `cityscapes` dataset. |
| 52 | + |
| 53 | +| Config | Backbone | Resolution | branch_filter_depths | pyramid_pool_bin_nums | mIoU | Download | |
| 54 | +|-------------------------|:--------------------:|:----------:|:--------------------:|:---------------------:|:-----:|:--------:| |
| 55 | +| Paper reference config | MobileNetMultiAVGSeg | 1024x2048 | [32, 32] | [4, 8, 16] | 75.98 | [ckpt](https://storage.googleapis.com/tf_model_garden/vision/mosaic/MobileNetMultiAVGSeg-r1024-ebf32-nogp.tar.gz)<br>[tensorboard](https://tensorboard.dev/experiment/okEog90bSwupajFgJwGEIw//#scalars) | |
| 56 | +| Current best config | MobileNetMultiAVGSeg | 1024x2048 | [64, 64] | [1, 4, 8, 16] | 77.24 | [ckpt](https://storage.googleapis.com/tf_model_garden/vision/mosaic/MobileNetMultiAVGSeg-r1024-ebf64-gp.tar.gz)<br>[tensorboard](https://tensorboard.dev/experiment/l5hkV7JaQM23EXeOBT6oJg/#scalars) | |
| 57 | + |
| 58 | +* `branch_filter_depths`: the number of convolution channels in each branch at |
| 59 | + a pyramid level after `Spatial Pyramid Pooling` |
| 60 | +* `pyramid_pool_bin_nums`: the number of bins at each level of the `Spatial |
| 61 | + Pyramid Pooling` |
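
The checkpoint archives linked in the Download column can be fetched from the command line. The following is a minimal sketch, not part of the official tooling: the helper names `mosaic_ckpt_url` and `mosaic_fetch_ckpt` and the default target directory are hypothetical. It assumes `curl` and `tar` are installed. The layout inside each archive is not documented here, so inspect it after unpacking.

```shell
# Base URL taken from the Download column of the results table.
MOSAIC_CKPT_BASE="https://storage.googleapis.com/tf_model_garden/vision/mosaic"

# Hypothetical helper: map a variant suffix (e.g. "ebf64-gp") to its archive URL.
mosaic_ckpt_url() {
  echo "${MOSAIC_CKPT_BASE}/MobileNetMultiAVGSeg-r1024-$1.tar.gz"
}

# Hypothetical helper: download one archive and unpack it into a directory.
# Stops at the first failing step (mkdir, curl, or tar).
mosaic_fetch_ckpt() {
  variant="$1"
  dest="${2:-./mosaic_ckpt}"
  mkdir -p "$dest" &&
    curl -fSL "$(mosaic_ckpt_url "$variant")" -o "$dest/$variant.tar.gz" &&
    tar -xzf "$dest/$variant.tar.gz" -C "$dest"
}

# Usage (requires network access):
#   mosaic_fetch_ckpt ebf32-nogp ./mosaic_ckpt   # paper reference config
#   mosaic_fetch_ckpt ebf64-gp   ./mosaic_ckpt   # current best config
```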
| 62 | + |
| 63 | +## Training |
| 64 | + |
| 65 | +Training can run on Google Cloud Platform using Cloud TPU. See the |
| 66 | +[Cloud TPU how-to guides](https://cloud.google.com/tpu/docs/how-to) for |
| 67 | +setup instructions. After setting up a Cloud TPU, launch |
| 68 | +training with: |
| 69 | + |
| 70 | +```shell |
| 71 | +EXP_TYPE=mosaic_mnv35_cityscapes |
| 72 | +EXP_NAME="<experiment-name>" # You can give any name to the experiment. |
| 73 | +TPU_NAME="<tpu-name>" # The name assigned while creating a Cloud TPU |
| 74 | +MODEL_DIR="gs://<path-to-model-directory>" |
| 75 | +# Now launch the experiment. |
| 76 | +python3 -m official.projects.mosaic.train \ |
| 77 | + --experiment=$EXP_TYPE \ |
| 78 | + --mode=train \ |
| 79 | + --tpu=$TPU_NAME \ |
| 80 | + --model_dir=$MODEL_DIR \ |
| 81 | + --config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml |
| 82 | +``` |
| 83 | + |
| 84 | +## Evaluation |
| 85 | + |
| 86 | +Run the following command for evaluation. |
| 87 | + |
| 88 | +```shell |
| 89 | +EXP_TYPE=mosaic_mnv35_cityscapes |
| 90 | +EXP_NAME="<experiment-name>" # You can give any name to the experiment. |
| 91 | +TPU_NAME="<tpu-name>" # The name assigned while creating a Cloud TPU |
| 92 | +MODEL_DIR="gs://<path-to-model-directory>" |
| 93 | +# Now launch the experiment. |
| 94 | +python3 -m official.projects.mosaic.train \ |
| 95 | + --experiment=$EXP_TYPE \ |
| 96 | + --mode=eval \ |
| 97 | + --tpu=$TPU_NAME \ |
| 98 | + --model_dir=$MODEL_DIR \ |
| 99 | + --config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml |
| 100 | +``` |
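
The training and evaluation commands differ only in the `--mode` flag, so the launch line can be built by one small function. The following is a sketch under stated assumptions: `mosaic_cmd` is a hypothetical helper, not part of the repository. It only prints the command and never runs it, and it falls back to the placeholder values shown above when `TPU_NAME` or `MODEL_DIR` is unset.

```shell
# Hypothetical wrapper: print the launch command for one mode.
# Flags and paths are copied from the commands in this README.
mosaic_cmd() {
  mode="$1"
  case "$mode" in
    train|eval) ;;
    *) echo "mode not covered by this sketch: $mode" >&2; return 1 ;;
  esac
  # printf reuses the '%s ' format for every argument, giving one line.
  printf '%s ' python3 -m official.projects.mosaic.train \
    "--experiment=${EXP_TYPE:-mosaic_mnv35_cityscapes}" \
    "--mode=${mode}" \
    "--tpu=${TPU_NAME:-<tpu-name>}" \
    "--model_dir=${MODEL_DIR:-gs://<path-to-model-directory>}" \
    "--config_file=official/projects/mosaic/configs/experiments/mosaic_mnv35_cityscapes_tdfs_tpu.yaml"
  echo
}

# Usage: set TPU_NAME and MODEL_DIR, print the command, review it, then run it.
#   mosaic_cmd train
#   mosaic_cmd eval
```

Printing the command instead of executing it makes it easy to check that the placeholders were replaced before any TPU time is spent.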
| 101 | + |
| 102 | +## License |
| 103 | + |
| 104 | +[Apache License 2.0](https://opensource.org/licenses/Apache-2.0) |
| 105 | + |
| 106 | +This project is licensed under the terms of the **Apache License 2.0**. |
| 107 | + |
| 108 | +## Citation |
| 109 | + |
| 110 | +If you want to cite this repository in your work, please consider citing the |
| 111 | +paper. |
| 112 | + |
| 113 | +``` |
| 114 | +@article{weijun2021mosaic, |
| 115 | + title={MOSAIC: Mobile Segmentation via decoding Aggregated Information and |
| 116 | + encoded Context}, |
| 117 | + author={Weijun Wang and Andrew Howard}, |
| 118 | + journal={arXiv preprint arXiv:2112.11623}, |
| 119 | + year={2021}, |
| 120 | +} |
| 121 | +``` |