
Commit a5e6bb5

Author: Anna Grebneva

Added LeViT 128s model (#3414)

1 parent e6f2a93 commit a5e6bb5

11 files changed: +255 −0 lines changed


demos/classification_benchmark_demo/cpp/README.md

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ omz_converter --list models.lst
 * hbonet-0.25
 * hbonet-1.0
 * inception-resnet-v2-tf
+* levit-128s
 * mixnet-l
 * mobilenet-v1-0.25-128
 * mobilenet-v1-1.0-224

demos/classification_benchmark_demo/cpp/models.lst

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ googlenet-v4-tf
 hbonet-0.25
 hbonet-1.0
 inception-resnet-v2-tf
+levit-128s
 mixnet-l
 mobilenet-v1-0.25-128
 mobilenet-v1-1.0-224

demos/classification_demo/python/README.md

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@ omz_converter --list models.lst
 * hbonet-0.25
 * hbonet-1.0
 * inception-resnet-v2-tf
+* levit-128s
 * mixnet-l
 * mobilenet-v1-0.25-128
 * mobilenet-v1-1.0-224

demos/classification_demo/python/models.lst

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ googlenet-v4-tf
 hbonet-0.25
 hbonet-1.0
 inception-resnet-v2-tf
+levit-128s
 mixnet-l
 mobilenet-v1-0.25-128
 mobilenet-v1-1.0-224

models/public/device_support.md

Lines changed: 1 addition & 0 deletions
@@ -61,6 +61,7 @@
 | hybrid-cs-model-mri | YES | | |
 | i3d-rgb-tf | YES | YES | |
 | inception-resnet-v2-tf | YES | YES | YES |
+| levit-128s | YES | YES | |
 | license-plate-recognition-barrier-0007 | YES | | |
 | mask_rcnn_inception_resnet_v2_atrous_coco | YES | YES | |
 | mask_rcnn_resnet50_atrous_coco | YES | YES | |

models/public/index.md

Lines changed: 2 additions & 0 deletions
@@ -62,6 +62,7 @@
 omz_models_model_hbonet_0_25
 omz_models_model_hbonet_1_0
 omz_models_model_inception_resnet_v2_tf
+omz_models_model_levit_128s
 omz_models_model_mixnet_l
 omz_models_model_mobilenet_v1_0_25_128
 omz_models_model_mobilenet_v1_1_0_224
@@ -344,6 +345,7 @@ You can download models and convert them into OpenVINO™ IR format (\*.xml + \*
 | Inception (GoogleNet) V3 | TensorFlow\*<br>PyTorch\* | [googlenet-v3](./googlenet-v3/README.md) <br> [googlenet-v3-pytorch](./googlenet-v3-pytorch/README.md) | 77.904%/93.808%<br>77.69%/93.7% | 11.469 | 23.817 |
 | Inception (GoogleNet) V4 | TensorFlow\* | [googlenet-v4-tf](./googlenet-v4-tf/README.md) | 80.204%/95.21% | 24.584 | 42.648 |
 | Inception-ResNet V2 | TensorFlow\* | [inception-resnet-v2-tf](./inception-resnet-v2-tf/README.md) | 77.82%/94.03% | 22.227 | 30.223 |
+| LeViT 128S | PyTorch\* | [levit-128s](./levit-128s/README.md) | 76.54%/92.85% | 0.6177 | 8.2199 |
 | MixNet L | TensorFlow\* | [mixnet-l](./mixnet-l/README.md) | 78.30%/93.91% | 0.565 | 7.3 |
 | MobileNet V1 0.25 128 | Caffe\* | [mobilenet-v1-0.25-128](./mobilenet-v1-0.25-128/README.md) | 40.54%/65% | 0.028 | 0.468 |
 | MobileNet V1 1.0 224 | Caffe\*<br>TensorFlow\* | [mobilenet-v1-1.0-224](./mobilenet-v1-1.0-224/README.md)<br>[mobilenet-v1-1.0-224-tf](./mobilenet-v1-1.0-224-tf/README.md)| 69.496%/89.224%<br>71.03%/89.94% | 1.148 | 4.221 |

models/public/levit-128s/README.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# levit-128s

## Use Case and High-Level Description

The `levit-128s` model is one of the LeViT model family: hybrid neural networks for fast-inference image classification. The model is pre-trained on the ImageNet dataset. LeViT-128S is a small LeViT variant with 128 channels at the input of the transformer stage and 2, 3 and 4 pairs of Attention and MLP blocks in model stages 1, 2 and 3 respectively.

The model input is a blob that consists of a single image with the shape `1, 3, 224, 224` in `RGB` order.

The model output is a typical object classifier for the 1000 different classifications matching those in the ImageNet database.

For details see [repository](https://github.com/rwightman/pytorch-image-models) and [paper](https://arxiv.org/abs/2104.01136).

## Specification

| Metric           | Value          |
| ---------------- | -------------- |
| Type             | Classification |
| GFLOPs           | 0.6177         |
| MParams          | 8.2199         |
| Source framework | PyTorch\*      |

## Accuracy

| Metric | Value  |
| ------ | ------ |
| Top 1  | 76.54% |
| Top 5  | 92.85% |

## Input

### Original model

Image, name - `image`, shape - `1, 3, 224, 224`, format is `B, C, H, W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`.
Mean values - [123.675, 116.28, 103.53], scale values - [58.395, 57.12, 57.375].

### Converted model

Image, name - `image`, shape - `1, 3, 224, 224`, format is `B, C, H, W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.

## Output

### Original model

Object classifier according to ImageNet classes, name - `probs`, shape - `1, 1000`, output data format is `B, C`, where:

- `B` - batch size
- `C` - predicted scores for each class, in logits format

### Converted model

Object classifier according to ImageNet classes, name - `probs`, shape - `1, 1000`, output data format is `B, C`, where:

- `B` - batch size
- `C` - predicted scores for each class, in logits format

## Download a Model and Convert it into OpenVINO™ IR Format

You can download models and if necessary convert them into OpenVINO™ IR format using the [Model Downloader and other automation tools](../../../tools/model_tools/README.md) as shown in the examples below.

An example of using the Model Downloader:
```
omz_downloader --name <model_name>
```

An example of using the Model Converter:
```
omz_converter --name <model_name>
```

## Demo usage

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:

* [Classification Benchmark C++ Demo](../../../demos/classification_benchmark_demo/cpp/README.md)
* [Classification Python\* Demo](../../../demos/classification_demo/python/README.md)

## Legal Information

The original model is distributed under the
[Apache License, Version 2.0](https://raw.githubusercontent.com/rwightman/pytorch-image-models/master/LICENSE).
A copy of the license is provided in `<omz_dir>/models/public/licenses/APACHE-2.0-PyTorch-Image-Models.txt`.
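
A minimal sketch of classifying one image with the converted IR via the OpenVINO Runtime Python API follows the input/output description above. It assumes the FP16 IR produced by `omz_converter --name levit-128s` at its default output path and a local test image; these paths, and the device choice, are illustrative assumptions rather than part of the commit.

```python
# Sketch only: IR path and image file are assumptions. The converted model takes
# raw BGR data because mean/scale normalization and the RGB conversion are folded
# into the IR by the Model Optimizer arguments in model.yml.
import cv2
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("public/levit-128s/FP16/levit-128s.xml")
compiled = core.compile_model(model, "CPU")

image = cv2.imread("input.jpg")                              # HWC, BGR, uint8
image = cv2.resize(image, (224, 224)).astype(np.float32)
blob = image.transpose(2, 0, 1)[np.newaxis]                  # 1, 3, 224, 224 (B, C, H, W)

logits = compiled([blob])[compiled.output(0)]                # 1, 1000 raw class scores
probs = np.exp(logits) / np.exp(logits).sum()                # softmax over the logits
top5 = probs[0].argsort()[-5:][::-1]
print(top5, probs[0, top5])
```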
models/public/levit-128s/accuracy-check.yml

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
models:
  - name: levit-128s-onnx

    launchers:
      - framework: onnx_runtime
        model: levit-128s.onnx
        adapter: classification

    datasets:
      - name: imagenet_1000_classes
        reader: pillow_imread
        preprocessing:
          - type: resize
            size: 249
            interpolation: BICUBIC
            aspect_ratio_scale: greater
            use_pillow: True
          - type: crop
            size: 224
            use_pillow: True
          - type: normalization
            mean: [123.675, 116.28, 103.53]
            std: [58.395, 57.12, 57.375]
        metrics:
          - name: accuracy@top1
            type: accuracy
            top_k: 1
            reference: 0.7654
          - name: accuracy@top5
            type: accuracy
            top_k: 5
            reference: 0.9285

  - name: levit-128s

    launchers:
      - framework: openvino
        adapter: classification

    datasets:
      - name: imagenet_1000_classes
        reader: pillow_imread
        preprocessing:
          - type: resize
            size: 249
            interpolation: BICUBIC
            aspect_ratio_scale: greater
            use_pillow: True
          - type: crop
            size: 224
            use_pillow: True
          - type: rgb_to_bgr
        metrics:
          - name: accuracy@top1
            type: accuracy
            top_k: 1
            reference: 0.7654
          - name: accuracy@top5
            type: accuracy
            top_k: 5
            reference: 0.9285
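
Both dataset sections describe the same evaluation preprocessing and differ only in the final color handling: explicit normalization for the original ONNX model, `rgb_to_bgr` for the OpenVINO IR. A rough Python equivalent of that pipeline is sketched below; it assumes `aspect_ratio_scale: greater` scales the shorter side to 249, as in the usual timm evaluation recipe, and uses a hypothetical local image file.

```python
# Sketch of the preprocessing described in the config above (original-model branch):
# resize the shorter side to 249 with bicubic interpolation, center-crop 224x224,
# then apply the ImageNet mean/std normalization. "input.jpg" is an assumption.
import numpy as np
from PIL import Image

image = Image.open("input.jpg").convert("RGB")

scale = 249 / min(image.size)                        # shorter side -> 249
new_size = (round(image.width * scale), round(image.height * scale))
image = image.resize(new_size, Image.BICUBIC)

left = (image.width - 224) // 2
top = (image.height - 224) // 2
image = image.crop((left, top, left + 224, top + 224))

data = np.asarray(image, dtype=np.float32)
data = (data - [123.675, 116.28, 103.53]) / [58.395, 57.12, 57.375]
data = data.transpose(2, 0, 1)[np.newaxis]           # 1, 3, 224, 224 for the `image` input
```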

models/public/levit-128s/model.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
# Copyright (c) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from torch import load
from timm.models.levit import levit_128s


def create_levit(weights):
    model = levit_128s()

    checkpoint = load(weights, map_location='cpu')['model']
    model.load_state_dict(checkpoint)

    return model
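
`create_levit()` is the entry point the OMZ converter imports (see `conversion_to_onnx_args` in `model.yml` below). A hedged, hand-rolled equivalent of that ONNX export step might look like the following; the local checkpoint path and the opset choice are illustrative assumptions.

```python
# Rough stand-in for the OMZ ONNX conversion step: build the model from the
# downloaded checkpoint and export it with the input/output names used elsewhere
# in this commit. Paths are assumptions.
import torch
from model import create_levit

net = create_levit("LeViT-128S-96703c44.pth")
net.eval()

dummy = torch.zeros(1, 3, 224, 224)
torch.onnx.export(
    net, dummy, "levit-128s.onnx",
    input_names=["image"], output_names=["probs"], opset_version=11,
)
```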

models/public/levit-128s/model.yml

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
# Copyright (c) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

description: >-
  The "levit-128s" model is one of the LeViT model family: hybrid neural networks
  for fast-inference image classification. The model is pre-trained on the ImageNet
  dataset. LeViT-128S is a small LeViT variant with 128 channels at the input of the
  transformer stage and 2, 3 and 4 pairs of Attention and MLP blocks in model stages
  1, 2 and 3 respectively.

  The model input is a blob that consists of a single image with the shape "1, 3, 224, 224"
  in "RGB" order.

  The model output is a typical object classifier for the 1000 different classifications
  matching those in the ImageNet database.

  For details see repository <https://github.com/rwightman/pytorch-image-models> and
  paper <https://arxiv.org/abs/2104.01136>.
task_type: classification
files:
  - name: timm-0.5.4-py3-none-any.whl
    size: 431537
    checksum: e8f1967a8e2029fe21a43875132b4b123227b718abc35725d7f2b9fd0ef2062884ac3dd558570b51a780aad89bc375d6
    source: https://files.pythonhosted.org/packages/49/65/a83208746dc9c0d70feff7874b49780ff110810feb528df4b0ecadcbee60/timm-0.5.4-py3-none-any.whl
  - name: LeViT-128S-96703c44.pth
    size: 32152063
    checksum: ac05427904bc10921aa04e4c5970ce75429e4b77231b6735d584d570f4dfaebd9de42539d2200802f1d5a069e8e0071a
    original_source: https://dl.fbaipublicfiles.com/LeViT/LeViT-128S-96703c44.pth
    source: https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/2022.2/levit-128s/LeViT-128S-96703c44.pth
postprocessing:
  - $type: unpack_archive
    format: zip
    file: timm-0.5.4-py3-none-any.whl
conversion_to_onnx_args:
  - --model-path=$dl_dir
  - --model-path=$config_dir
  - --model-name=create_levit
  - --import-module=model
  - --model-param=weights=r"$dl_dir/LeViT-128S-96703c44.pth"
  - --input-shape=1,3,224,224
  - --input-names=image
  - --output-names=probs
  - --output-file=$conv_dir/levit-128s.onnx
input_info:
  - name: image
    shape: [1, 3, 224, 224]
    layout: NCHW
model_optimizer_args:
  - --input_model=$conv_dir/levit-128s.onnx
  - --mean_values=image[123.675,116.28,103.53]
  - --scale_values=image[58.395, 57.12, 57.375]
  - --reverse_input_channels
  - --output=probs
framework: pytorch
license: https://raw.githubusercontent.com/rwightman/pytorch-image-models/master/LICENSE
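
The `model_optimizer_args` above fold the original model's preprocessing (RGB order plus ImageNet mean/scale) into the IR, which is why the converted model accepts raw BGR input. For comparison, a hedged sketch of running the intermediate ONNX model directly with ONNX Runtime, where that normalization must still be done by the caller:

```python
# Sketch only: assumes levit-128s.onnx in the working directory; replace the
# placeholder blob with a normalized RGB image (see the preprocessing sketch
# after accuracy-check.yml above).
import numpy as np
import onnxruntime as ort

data = np.zeros((1, 3, 224, 224), dtype=np.float32)   # placeholder preprocessed blob

session = ort.InferenceSession("levit-128s.onnx", providers=["CPUExecutionProvider"])
logits = session.run(["probs"], {"image": data})[0]   # shape (1, 1000), raw logits
print(int(logits.argmax()))
```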
