
Commit 1469a5b

Updates:
- Minor bug fix for the Object Detection (OD) use case preprocessing.
- Updated dataset naming of the Semantic Segmentation (SS) use case in YAML files and READMEs.

Signed-off-by: khaoula boutiche <[email protected]>
1 parent 291581a commit 1469a5b

20 files changed (+49 / -48 lines)


human_activity_recognition/src/user_config.yaml

Lines changed: 6 additions & 6 deletions
@@ -60,12 +60,12 @@ training:
   # trained_model_path: trained.h5   # Optional, use it if you want to save the best model at the end of the training to a path of your choice

 tools:
-  stm32ai:
-    version: 8.1.0
-    optimization: balanced
-    on_cloud: True
-    path_to_stm32ai: C:/Users/XXX/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/8.1.0/Utilities/windows/stm32ai.exe
-  path_to_cubeIDE: C:/ST/STM32CubeIDE_1.1.0/STM32CubeIDE/stm32cubeide.exe
+  stedgeai:
+    version: 9.1.0
+    optimization: balanced
+    on_cloud: True
+    path_to_stedgeai: C:/Users/<XXXXX>/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/<*.*.*>/Utilities/windows/stedgeai.exe
+  path_to_cubeIDE: C:/ST/STM32CubeIDE_1.15.0/STM32CubeIDE/stm32cubeide.exe

 benchmarking:
   board: B-U585I-IOT02A

object_detection/src/preprocessing/tiny_yolo_v2_preprocess.py

Lines changed: 6 additions & 5 deletions
@@ -492,11 +492,12 @@ def tiny_yolo_v2_preprocess(cfg):
     # Load the quantization dataset if provided
     if quantization_path or training_path:

-        if Path(cfg.general.model_path).suffix =='.onnx':
-            _, ish = get_model_name_and_its_input_shape(cfg.general.model_path)
-            input_shape = [ish[1],ish[2],ish[0]]
-        else:
-            _, input_shape = get_model_name_and_its_input_shape(cfg.general.model_path)
+        if cfg.general.model_path:
+            if Path(cfg.general.model_path).suffix =='.onnx':
+                _, ish = get_model_name_and_its_input_shape(cfg.general.model_path)
+                input_shape = [ish[1],ish[2],ish[0]]
+            else:
+                _, input_shape = get_model_name_and_its_input_shape(cfg.general.model_path)

         img_width, img_height, _ = input_shape
         if training_path:
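
The change above guards the input-shape lookup so it only runs when `general.model_path` is set, while keeping the existing ONNX handling that reorders the channels-first shape to (height, width, channels). A minimal standalone sketch of the same logic, with a hypothetical `get_input_shape` stand-in for the repository's `get_model_name_and_its_input_shape` helper (only its `(name, shape)` return value is assumed from the diff):

```python
# Standalone sketch of the guarded input-shape resolution added by this commit.
# `get_input_shape` is a hypothetical stand-in for the zoo's helper.
from pathlib import Path

def resolve_input_shape(model_path, get_input_shape):
    """Return the model input shape as [height, width, channels], or None when no model path is set."""
    if not model_path:
        # Without this guard, the lookup below would run (and fail) even when
        # no model path is provided in the configuration.
        return None
    _, shape = get_input_shape(model_path)
    if Path(model_path).suffix == '.onnx':
        # ONNX models report a channels-first (C, H, W) input shape; reorder it to (H, W, C).
        return [shape[1], shape[2], shape[0]]
    return list(shape)

# Example usage with dummy shape providers:
print(resolve_input_shape("model.onnx", lambda p: ("model", (3, 416, 416))))  # [416, 416, 3]
print(resolve_input_shape("model.h5", lambda p: ("model", (416, 416, 3))))    # [416, 416, 3]
print(resolve_input_shape(None, lambda p: ("model", (3, 416, 416))))          # None
```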

semantic_segmentation/pretrained_models/README.md

Lines changed: 4 additions & 4 deletions
@@ -33,10 +33,10 @@ IoU are averaged on all classes including background.

 | Models | Implementation | Dataset | Input Resolution | Accuracy (%) | average IoU | Activation RAM (MiB) | Weights Flash (MiB) | STM32Cube.AI version | Source |
 |-------------------------------------------------------|----------------|------------------------|------------------|--------------|-------------|----------------------|---------------------|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| deeplabv3_257_int8_per_tensor | TensorFlow | PASCAL VOC + COCO 2012 | 257x257x3 | 88.66 | 59.06 | 25.7 | 0.86 | 9.1.0 | Available in X-LINUX-AI package [link](https://www.st.com/en/embedded-software/x-linux-ai.html) |
-| deeplab_v3_mobilenetv2_05_fft_float32 | Tensorflow | PASCAL VOC + COCO 2012 | 512x512x3 | 93.29 | 73.44 | / | / | 9.1.0 | [link](./deeplab_v3/ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft.h5)
-| deeplab_v3_mobilenetv2_05_fft_per_channel | Tensorflow | PASCAL VOC + COCO 2012 | 512x512x3 | 91.3 | 67.32 | 57.38 | 7.63 | 9.1.0 | [link](./deeplab_v3/ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite) |
-deeplab_v3_mobilenetv2_05_fft_int8_f32_per_channel | Tensorflow | PASCAL VOC + COCO 2012 | 512x512x3 | 92.83 | 71.93 | 55.91 | 6.2 | 9.1.0 | [link](./deeplab_v3/ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8_f32.tflite) |
+| deeplabv3_257_int8_per_tensor | TensorFlow | COCO 2017 + PASCAL VOC 2012 | 257x257x3 | 88.66 | 59.06 | 25.7 | 0.86 | 9.1.0 | Available in X-LINUX-AI package [link](https://www.st.com/en/embedded-software/x-linux-ai.html) |
+| deeplab_v3_mobilenetv2_05_fft_float32 | Tensorflow | COCO 2017 + PASCAL VOC 2012 | 512x512x3 | 93.29 | 73.44 | / | / | 9.1.0 | [link](./deeplab_v3/ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft.h5)
+| deeplab_v3_mobilenetv2_05_fft_per_channel | Tensorflow | COCO 2017 + PASCAL VOC 2012 | 512x512x3 | 91.3 | 67.32 | 57.38 | 7.63 | 9.1.0 | [link](./deeplab_v3/ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite) |
+deeplab_v3_mobilenetv2_05_fft_int8_f32_per_channel | Tensorflow | COCO 2017 + PASCAL VOC 2012 | 512x512x3 | 92.83 | 71.93 | 55.91 | 6.2 | 9.1.0 | [link](./deeplab_v3/ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8_f32.tflite) |
 </details>


semantic_segmentation/pretrained_models/deeplab_v3/README.md

Lines changed: 10 additions & 10 deletions
@@ -59,7 +59,7 @@ For an image resolution of NxM and P classes
 To train the deeplab_v3 with backbone MobileNet v2 model with pretrained weights, from scratch or fine-tune it on your own dataset, you need to configure the [user_config.yaml](../../src/user_config.yaml) file following the
 [tutorial](../../README.md) under the src section.

-As an example, [deeplab_v3_mobilenetv2_05_16_512_fft.yaml](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_config.yaml) file is used to train on PASCAL VOC + COCO 2012 dataset. You can copy its content in the [user_config.yaml](../../src/user_config.yaml) file provided under
+As an example, [deeplab_v3_mobilenetv2_05_16_512_fft.yaml](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_config.yaml) file is used to train on COCO 2017 + PASCAL VOC 2012 dataset. You can copy its content in the [user_config.yaml](../../src/user_config.yaml) file provided under
 the src section to reproduce the results presented below.

 ## Deployment
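
The training paragraph in the hunk above says to copy the example configuration into [user_config.yaml](../../src/user_config.yaml) to reproduce the reported results. A minimal sketch of that step, assuming it is run from the `deeplab_v3` pretrained-models directory; the paths simply follow the links above and this is not a repository script:

```python
# Sketch (assumption, not part of the repository): copy the example training
# configuration over src/user_config.yaml before launching training.
import shutil

example_cfg = (
    "ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/"
    "deeplab_v3_mobilenetv2_05_16_512_fft/"
    "deeplab_v3_mobilenetv2_05_16_512_fft_config.yaml"
)
shutil.copyfile(example_cfg, "../../src/user_config.yaml")
```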
@@ -73,16 +73,16 @@ To deploy your trained model, you need to configure the same [user_config.yaml](
 Measures are done with default STM32Cube.AI configuration with enabled input / output allocated option.


-### Reference **MPU** inference time based on PASCAL VOC + COCO 2012 segmentation dataset 21 classes (see Accuracy for details on dataset)
+### Reference **MPU** inference time based on COCO 2017 + PASCAL VOC 2012 segmentation dataset 21 classes (see Accuracy for details on dataset)
 | Model | Dataset | Format | Resolution | Quantization | Board | Execution Engine | Frequency | Inference time (ms) | %NPU | %GPU | %CPU | X-LINUX-AI version | Framework |
 |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|--------|------------|----------------|-------------------|------------------|-----------|---------------------|-------|--------|------|--------------------|-----------------------|
-| [DeepLabV3 per tensor (no ASPP)](https://www.st.com/en/embedded-software/x-linux-ai.html) | PASCAL VOC + COCO 2012 | Int8 | 257x257x3 | per-tensor | STM32MP257F-DK2 | NPU/GPU | 1500 MHz | 52.75 | 99.2 | 0.80 | 0 | v5.1.0 | OpenVX | | | | | v5.1.0
-| [DeepLabV3 per channel](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite) | PASCAL VOC + COCO 2012 | Int8 | 512x512x3 | per-channel ** | STM32MP257F-DK2 | NPU/GPU | 1500 MHz | 806.12 | 8.73| 91.27 | 0 | v5.1.0 | OpenVX |
-| [DeepLabV3 mixed precision](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8_f32.tflite) | PASCAL VOC + COCO 2012 | Int8 & float32 | 512x512x3 | per-channel ** | STM32MP257F-DK2 | NPU/GPU | 1500 MHz | 894.56 | 7.67 | 92.33 | 0 | v5.1.0 | OpenVX |
+| [DeepLabV3 per tensor (no ASPP)](https://www.st.com/en/embedded-software/x-linux-ai.html) | COCO 2017 + PASCAL VOC 2012 | Int8 | 257x257x3 | per-tensor | STM32MP257F-DK2 | NPU/GPU | 1500 MHz | 52.75 | 99.2 | 0.80 | 0 | v5.1.0 | OpenVX | | | | | v5.1.0
+| [DeepLabV3 per channel](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite) | COCO 2017 + PASCAL VOC 2012 | Int8 | 512x512x3 | per-channel ** | STM32MP257F-DK2 | NPU/GPU | 1500 MHz | 806.12 | 8.73| 91.27 | 0 | v5.1.0 | OpenVX |
+| [DeepLabV3 mixed precision](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8_f32.tflite) | COCO 2017 + PASCAL VOC 2012 | Int8 & float32 | 512x512x3 | per-channel ** | STM32MP257F-DK2 | NPU/GPU | 1500 MHz | 894.56 | 7.67 | 92.33 | 0 | v5.1.0 | OpenVX |

 ** **To get the most out of MP25 NPU hardware acceleration, please use per-tensor quantization**

-### Accuracy with PASCAL VOC + COCO 2012
+### Accuracy with COCO 2017 + PASCAL VOC 2012

 Dataset details: [link](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/), License [Database Contents License (DbCL) v1.0](https://opendatacommons.org/licenses/dbcl/1-0/) , Number of classes: 21, Number of images: 11530
 Please note, that the following accuracies are evaluated on Pascal VOC 2012 validation set (val.txt), and with a preprocessing resize with interpolation method 'bilinear'.
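
The note above states that accuracies are measured on the Pascal VOC 2012 validation list (val.txt) after a bilinear resize. A minimal sketch of that resize step, using TensorFlow as an assumed framework; the zoo's actual evaluation pipeline may normalize and batch differently:

```python
# Sketch (assumption) of the bilinear resize applied to evaluation images.
import tensorflow as tf

def resize_for_eval(image, target_size=(512, 512)):
    """Resize an HxWx3 image to the model input resolution with bilinear interpolation."""
    return tf.image.resize(image, target_size, method="bilinear")

# Example with a dummy image:
dummy = tf.zeros([375, 500, 3], dtype=tf.uint8)
print(resize_for_eval(dummy).shape)  # (512, 512, 3)
```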
@@ -91,17 +91,17 @@ Moreover, IoU are averaged on all classes including background.
 | Model Description | Resolution | Format | Accuracy | Averaged IoU |
 |--------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|------------|----------|--------------|
 | [DeepLabV3 per tensor (no ASPP)](https://www.st.com/en/embedded-software/x-linux-ai.html) | 257x257x3 | Int8 | 88.6% | 59.33% |
-| [DeepLabV3 float precision](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft.h5) | 512x512x3 | Float | 93.29% | 73.44% |
-| [DeepLabV3 per channel](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite) | 512x512x3 | Int8 | 91.3% | 67.32% |
-| [DeepLabV3 mixed precision](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8_f32.tflite) | 512x512x3 | Int8/Float | 92.83% | 71.93% |
+| [DeepLabV3 float precision](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft.h5) | 512x512x3 | Float | 93.29% | 73.44% |
+| [DeepLabV3 per channel](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite) | 512x512x3 | Int8 | 91.3% | 67.32% |
+| [DeepLabV3 mixed precision](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8_f32.tflite) | 512x512x3 | Int8/Float | 92.83% | 71.93% |

 ## Retraining and code generation

 - **DeepLabV3 per tensor**:
 This model, which does not include ASPP (Atrous Spatial Pyramid Pooling), was downloaded from the TensorFlow DeepLabV3 page on[Kaggle](https://www.kaggle.com/models/tensorflow/deeplabv3/).

 - **DeepLabV3 float precision**:
-This model is the result of using the [deeplab_v3_mobilenetv2_05_16_512_fft.yaml](./ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_config.yaml) configuration file to train the model on the PASCAL VOC + COCO 2012 dataset.
+This model is the result of using the [deeplab_v3_mobilenetv2_05_16_512_fft.yaml](./ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_config.yaml) configuration file to train the model on the COCO 2017 + PASCAL VOC 2012 dataset.

 - **DeepLabV3 Per channel**:
 This model is quantized `per channel` version of DeepLabV3 float precision. It is generated using the quantization service with the [the quantization_config.yaml](../../src/config_file_examples/quantization_config.yaml) configuration file.
Lines changed: 6 additions & 6 deletions
@@ -13,13 +13,13 @@ dataset:
                 "car", "cat", "chair", "cow", "dining table", "dog", "horse", "motorbike",
                 "person", "potted plant", "sheep", "sofa", "train", "tv/monitor"]

-  training_path: ../datasets/VOC2012_COCO_train_val/JPEGImages
-  training_masks_path: ../datasets/VOC2012_COCO_train_val/SegmentationClassAug
-  training_files_path: ../datasets/VOC2012_COCO_train_val/ImageSets/Segmentation/trainaug.txt
+  training_path: ../datasets/COCO2017_VOC2012/JPEGImages
+  training_masks_path: ../datasets/COCO2017_VOC2012/SegmentationClassAug
+  training_files_path: ../datasets/COCO2017_VOC2012/ImageSets/Segmentation/trainaug.txt

-  validation_path: ../datasets/VOC2012_COCO_train_val/JPEGImages
-  validation_masks_path: ../datasets/VOC2012_COCO_train_val/SegmentationClassAug
-  validation_files_path: ../datasets/VOC2012_COCO_train_val/ImageSets/Segmentation/val.txt
+  validation_path: ../datasets/COCO2017_VOC2012/JPEGImages
+  validation_masks_path: ../datasets/COCO2017_VOC2012/SegmentationClassAug
+  validation_files_path: ../datasets/COCO2017_VOC2012/ImageSets/Segmentation/val.txt
   validation_split:

   test_path:

semantic_segmentation/src/benchmarking/README.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ In particular, `operation_mode` should be set to evaluation and the `benchmarkin

 ```yaml
 general:
-  model_path: ../pretrained_models/deeplab_v3/ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite
+  model_path: ../pretrained_models/deeplab_v3/ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite

 operation_mode: benchmarking
 ```

semantic_segmentation/src/config_file_examples/benchmarking_config.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 general:
-  model_path: ../pretrained_models/deeplab_v3/ST_pretrainedmodel_public_dataset/pascal_voc_coco_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite
+  model_path: ../pretrained_models/deeplab_v3/ST_pretrainedmodel_public_dataset/coco_2017_pascal_voc_2012/deeplab_v3_mobilenetv2_05_16_512_fft/deeplab_v3_mobilenetv2_05_16_512_fft_int8.tflite

 operation_mode: benchmarking