Added ConvNeXt Tiny model (#3419)

Anna Grebneva · web-flow · commit 932293f66cce · 2022-03-31T17:03:24.000+03:00
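
Note on preprocessing: the diff below normalizes RGB input explicitly for the ONNX model (`accuracy-check.yml`), while the converted IR takes plain BGR input because `--mean_values`, `--scale_values`, and `--reverse_input_channels` are passed to Model Optimizer in `model.yml`. The following is a minimal sketch of why the two paths should agree, assuming the Model Optimizer options behave as documented (mean/scale values given in the original model's channel order). The helper names are hypothetical and are not part of this commit.

```python
# Values copied from model.yml / accuracy-check.yml in this commit.
MEAN_RGB = (123.675, 116.28, 103.53)
SCALE_RGB = (58.395, 57.12, 57.375)


def normalize_rgb(pixel):
    """Apply (value - mean) / scale per channel to one RGB pixel (0-255 range)."""
    return tuple((v - m) / s for v, m, s in zip(pixel, MEAN_RGB, SCALE_RGB))


def original_model_input(rgb_pixel):
    # The ONNX model receives an RGB tensor that was normalized beforehand.
    return normalize_rgb(rgb_pixel)


def converted_model_equivalent(bgr_pixel):
    # Hypothetical helper: mimics what the IR is assumed to do internally,
    # i.e. reverse BGR to RGB, then apply the embedded mean and scale.
    rgb_pixel = tuple(reversed(bgr_pixel))
    return normalize_rgb(rgb_pixel)


rgb = (255.0, 128.0, 0.0)
bgr = tuple(reversed(rgb))
same = original_model_input(rgb) == converted_model_equivalent(bgr)
```

Here `same` is `True`, which is the reason the OpenVINO entry in `accuracy-check.yml` needs only `rgb_to_bgr` and no `normalization` step.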
diff --git a/demos/classification_benchmark_demo/cpp/README.md b/demos/classification_benchmark_demo/cpp/README.md
@@ -37,6 +37,7 @@ omz_converter --list models.lst
 
 * alexnet
 * caffenet
+* convnext-tiny
 * densenet-121
 * densenet-121-tf
 * dla-34
diff --git a/demos/classification_benchmark_demo/cpp/models.lst b/demos/classification_benchmark_demo/cpp/models.lst
@@ -1,6 +1,7 @@
 # This file can be used with the --list option of the model downloader.
 alexnet
 caffenet
+convnext-tiny
 densenet-121
 densenet-121-tf
 dla-34
diff --git a/demos/classification_demo/python/README.md b/demos/classification_demo/python/README.md
@@ -39,6 +39,7 @@ omz_converter --list models.lst
 
 * alexnet
 * caffenet
+* convnext-tiny
 * densenet-121
 * densenet-121-tf
 * dla-34
diff --git a/demos/classification_demo/python/models.lst b/demos/classification_demo/python/models.lst
@@ -1,6 +1,7 @@
 # This file can be used with the --list option of the model downloader.
 alexnet
 caffenet
+convnext-tiny
 densenet-121
 densenet-121-tf
 dla-34
diff --git a/models/public/convnext-tiny/README.md b/models/public/convnext-tiny/README.md
@@ -0,0 +1,95 @@
+# convnext-tiny
+
+## Use Case and High-Level Description
+
+The `convnext-tiny` model is a tiny version of the ConvNeXt model, constructed entirely from standard ConvNet modules. ConvNeXt is accurate, efficient, scalable, and very simple in design. The model is pre-trained for the image classification task on the ImageNet dataset.
+
+The model input is a blob that consists of a single image with shape `1, 3, 224, 224` in `RGB` order.
+
+The model output is a typical object classifier output for 1000 classes, matching those in the ImageNet database.
+
+For details, see the [repository](https://github.com/rwightman/pytorch-image-models) and [paper](https://arxiv.org/abs/2201.03545).
+
+## Specification
+
+| Metric           | Value          |
+| ---------------- | -------------- |
+| Type             | Classification |
+| GFLOPs           | 8.9419         |
+| MParams          | 28.5892        |
+| Source framework | PyTorch\*      |
+
+## Accuracy
+
+| Metric | Value  |
+| ------ | ------ |
+| Top 1  | 82.05% |
+| Top 5  | 95.86% |
+
+## Input
+
+### Original model
+
+Image, name - `image`, shape - `1, 3, 224, 224`, format is `B, C, H, W`, where:
+
+- `B` - batch size
+- `C` - channel
+- `H` - height
+- `W` - width
+
+Channel order is `RGB`.
+Mean values - [123.675, 116.28, 103.53], scale values - [58.395, 57.12, 57.375].
+
+### Converted model
+
+Image, name - `image`, shape - `1, 3, 224, 224`, format is `B, C, H, W`, where:
+
+- `B` - batch size
+- `C` - channel
+- `H` - height
+- `W` - width
+
+Channel order is `BGR`.
+
+## Output
+
+### Original model
+
+Object classifier according to ImageNet classes, name - `probs`, shape - `1, 1000`, output data format is `B, C`, where:
+
+- `B` - batch size
+- `C` - predicted scores for each class in logits format
+
+### Converted model
+
+Object classifier according to ImageNet classes, name - `probs`, shape - `1, 1000`, output data format is `B, C`, where:
+
+- `B` - batch size
+- `C` - predicted scores for each class in logits format
+
+## Download a Model and Convert it into OpenVINO™ IR Format
+
+You can download models and, if necessary, convert them into OpenVINO™ IR format using the [Model Downloader and other automation tools](../../../tools/model_tools/README.md), as shown in the examples below.
+
+An example of using the Model Downloader:
+```
+omz_downloader --name <model_name>
+```
+
+An example of using the Model Converter:
+```
+omz_converter --name <model_name>
+```
+
+## Demo usage
+
+The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:
+
+* [Classification Benchmark C++ Demo](../../../demos/classification_benchmark_demo/cpp/README.md)
+* [Classification Python\* Demo](../../../demos/classification_demo/python/README.md)
+
+## Legal Information
+
+The original model is distributed under the
+[Apache License, Version 2.0](https://raw.githubusercontent.com/rwightman/pytorch-image-models/master/LICENSE).
+A copy of the license is provided in `<omz_dir>/models/public/licenses/APACHE-2.0-PyTorch-Image-Models.txt`.
diff --git a/models/public/convnext-tiny/accuracy-check.yml b/models/public/convnext-tiny/accuracy-check.yml
@@ -0,0 +1,61 @@
+models:
+  - name: convnext-tiny-onnx
+
+    launchers:
+      - framework: onnx_runtime
+        model: convnext-tiny.onnx
+        adapter: classification
+
+    datasets:
+      - name: imagenet_1000_classes
+        reader: pillow_imread
+        preprocessing:
+          - type: resize
+            size: 256
+            interpolation: BICUBIC
+            aspect_ratio_scale: greater
+            use_pillow: True
+          - type: crop
+            size: 224
+            use_pillow: True
+          - type: normalization
+            mean: [123.675, 116.28, 103.53]
+            std: [58.395, 57.12, 57.375]
+        metrics:
+          - name: accuracy@top1
+            type: accuracy
+            top_k: 1
+            reference: 0.8205
+          - name: accuracy@top5
+            type: accuracy
+            top_k: 5
+            reference: 0.9586
+
+  - name: convnext-tiny
+
+    launchers:
+      - framework: openvino
+        adapter: classification
+
+    datasets:
+      - name: imagenet_1000_classes
+        reader: pillow_imread
+        preprocessing:
+          - type: resize
+            size: 256
+            interpolation: BICUBIC
+            aspect_ratio_scale: greater
+            use_pillow: True
+          - type: crop
+            size: 224
+            use_pillow: True
+          - type: rgb_to_bgr
+        metrics:
+          - name: accuracy@top1
+            type: accuracy
+            top_k: 1
+            reference: 0.8205
+          - name: accuracy@top5
+            type: accuracy
+            top_k: 5
+            reference: 0.9586
diff --git a/models/public/convnext-tiny/model.py b/models/public/convnext-tiny/model.py
@@ -0,0 +1,26 @@
+# Copyright (c) 2022 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from torch import load
+from timm.models.convnext import convnext_tiny, checkpoint_filter_fn
+
+
+def create_convnext(weights):
+    model = convnext_tiny()
+
+    checkpoint = load(weights, map_location='cpu')  # deserialize tensors on CPU
+    ckpt = checkpoint_filter_fn(checkpoint, model)  # remap original checkpoint keys to timm names
+    model.load_state_dict(ckpt)
+
+    return model
diff --git a/models/public/convnext-tiny/model.yml b/models/public/convnext-tiny/model.yml
@@ -0,0 +1,65 @@
+# Copyright (c) 2022 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+description: >-
+  The "convnext-tiny" model is a tiny version of the ConvNeXt model, constructed entirely
+  from standard ConvNet modules. ConvNeXt is accurate, efficient, scalable, and very
+  simple in design. The model is pre-trained for the image classification task on the
+  ImageNet dataset.
+
+  The model input is a blob that consists of a single image with shape "1, 3, 224, 224"
+  in "RGB" order.
+
+  The model output is a typical object classifier output for 1000 classes, matching
+  those in the ImageNet database.
+
+  For details, see the repository <https://github.com/rwightman/pytorch-image-models> and
+  paper <https://arxiv.org/abs/2201.03545>.
+task_type: classification
+files:
+  - name: timm-0.5.4-py3-none-any.whl
+    size: 431537
+    checksum: e8f1967a8e2029fe21a43875132b4b123227b718abc35725d7f2b9fd0ef2062884ac3dd558570b51a780aad89bc375d6
+    source: https://files.pythonhosted.org/packages/49/65/a83208746dc9c0d70feff7874b49780ff110810feb528df4b0ecadcbee60/timm-0.5.4-py3-none-any.whl
+  - name: convnext_tiny_1k_224_ema.pth
+    size: 114414741
+    checksum: f277194ca9561079ea0519fffc204820922ae46dc865b9def7208cf5993382d6cced681d3d26f6b70c31bf7cec4aba62
+    original_source: https://dl.fbaipublicfiles.com/convnext/convnext_tiny_1k_224_ema.pth
+    source: https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/2022.2/convnext-tiny/convnext_tiny_1k_224_ema.pth
+postprocessing:
+  - $type: unpack_archive
+    format: zip
+    file: timm-0.5.4-py3-none-any.whl
+conversion_to_onnx_args:
+  - --model-path=$dl_dir
+  - --model-path=$config_dir
+  - --model-name=create_convnext
+  - --import-module=model
+  - --model-param=weights=r"$dl_dir/convnext_tiny_1k_224_ema.pth"
+  - --input-shape=1,3,224,224
+  - --input-names=image
+  - --output-names=probs
+  - --output-file=$conv_dir/convnext-tiny.onnx
+input_info:
+  - name: image
+    shape: [1, 3, 224, 224]
+    layout: NCHW
+model_optimizer_args:
+  - --input_model=$conv_dir/convnext-tiny.onnx
+  - --mean_values=image[123.675,116.28,103.53]
+  - --scale_values=image[58.395,57.12,57.375]
+  - --reverse_input_channels
+  - --output=probs
+framework: pytorch
+license: https://raw.githubusercontent.com/rwightman/pytorch-image-models/master/LICENSE
diff --git a/models/public/device_support.md b/models/public/device_support.md
@@ -15,6 +15,7 @@
 | colorization-siggraph | YES | YES | YES |
 | colorization-v2 | YES | YES | YES |
 | common-sign-language-0001 | YES | YES |    |
+| convnext-tiny | YES |    |    |
 | ctdet_coco_dlav0_512 | YES | YES | YES |
 | ctpn | YES | YES | YES |
 | deblurgan-v2 | YES | YES | YES |
diff --git a/models/public/index.md b/models/public/index.md
@@ -45,6 +45,7 @@
    omz_models_model_alexnet
    omz_models_model_anti_spoof_mn3
    omz_models_model_caffenet
+   omz_models_model_convnext_tiny
    omz_models_model_densenet_121
    omz_models_model_densenet_121_tf
    omz_models_model_dla_34
@@ -337,6 +338,7 @@ You can download models and convert them into OpenVINO™ IR format (\*.xml + \*
 | AlexNet                     | Caffe\*                            | [alexnet](./alexnet/README.md)   | 56.598%/79.812% | 1.5 | 60.965 |
 | AntiSpoofNet                | PyTorch\*                          | [anti-spoof-mn3](./anti-spoof-mn3/README.md) | 3.81% | 0.15 | 3.02 |
 | CaffeNet                    | Caffe\*                            | [caffenet](./caffenet/README.md)  | 56.714%/79.916% | 1.5 | 60.965 |
+| ConvNeXt Tiny               | PyTorch\*                          | [convnext-tiny](./convnext-tiny/README.md) | 82.05%/95.86% | 8.9419 | 28.5892 |
 | DenseNet 121                | Caffe\*<br>TensorFlow\*| [densenet-121](./densenet-121/README.md)<br>[densenet-121-tf](./densenet-121-tf/README.md)| 74.42%/92.136%<br>74.46%/92.13%| 5.723~5.7287 | 7.971 |
 | DLA 34                      | PyTorch\*                          | [dla-34](./dla-34/README.md) | 74.64%/92.06% | 6.1368 | 15.7344 |
 | EfficientNet B0             | TensorFlow\*<br>PyTorch\*          | [efficientnet-b0](./efficientnet-b0/README.md)<br>[efficientnet-b0-pytorch](./efficientnet-b0-pytorch/README.md) | 75.70%/92.76%<br>77.70%/93.52% | 0.819 | 5.268 |
diff --git a/tools/accuracy_checker/configs/convnext-tiny.yml b/tools/accuracy_checker/configs/convnext-tiny.yml
@@ -0,0 +1 @@
+../../../models/public/convnext-tiny/accuracy-check.yml
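
Note on the `probs` output: the README added in this commit describes the `1, 1000` output as being in logits format, so a softmax is needed if actual probabilities are wanted (ranking for top-1/top-5 is unaffected). Below is a minimal, dependency-free sketch of that post-processing step; the function names and the toy logits are illustrative assumptions, not part of the commit or of the Open Model Zoo demos.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return (class_index, probability) pairs for the k largest probabilities."""
    ranked = sorted(enumerate(probs), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up logits for a 4-class toy case; the real output has 1000 entries.
toy_logits = [2.0, -1.0, 0.5, 3.0]
probs = softmax(toy_logits)
best = top_k(probs, k=2)
```

Subtracting the maximum logit before exponentiation does not change the result and avoids overflow for large values.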