Commit fb06762

Release AI-ModelZoo-2.0.1:
- Disclosed some ST object detection models: `st_yolo_lc_v1` and `ssd_mobilenet_v2_fpnlite`, with various resolutions.
- Disclosed some ST image classification models: `st_efficientnet_lc_v1`, `st_fdmobilenet_v1`, `st_resnet_8_hybrid_v1`, and `st_resnet_8_hybrid_v2`, in different resolutions.
- Fixed minor bugs and updated the README documentation.

Signed-off-by: khaoula boutiche <[email protected]>
1 parent e8d5d6d commit fb06762

164 files changed: +4790 −6397 lines changed


README.md

Lines changed: 16 additions & 8 deletions
````diff
@@ -20,6 +20,13 @@ The performances on reference STM32 MCU and MPU are provided for float and quant
 This project is organized by application, for each application you will have a step by step guide that will indicate how
 to train and deploy the models.
 
+## What's new in release 2.0:
+* An aligned and `uniform architecture` for all the use cases
+* A modular design to run the different operation modes (training, benchmarking, evaluation, deployment, quantization) independently, or to chain multiple modes in a single launch.
+* A simple, `single entry point` to the code: a .yaml configuration file configures all the needed services.
+* Support of the `Bring Your Own Model (BYOM)` feature, allowing users to (re-)train their own models. An example is provided [here](./image_classification/src/training/README.md#51-training-your-own-model).
+* Support of the `Bring Your Own Data (BYOD)` feature, allowing users to fine-tune pretrained models with their own datasets. An example is provided [here](./image_classification/src/training/README.md#23-dataset-specification).
+
 
 <div align="center" style="margin-top: 80px; padding: 20px 0;">
 <p align="center">
@@ -30,16 +37,17 @@ to train and deploy the models.
 </div>
 
 ## Available use-cases
-
-* [Image classification](image_classification)
+>[!TIP]
+> For all the use cases below, quick and easy examples are provided and can be executed for a fast ramp-up (click on the use-case links below).
+* [Image classification (IC)](image_classification)
   * Models: EfficientNet, MobileNet v1, MobileNet v2, Resnet v1 including with hybrid quantization,
     SqueezeNet v1.1, STMNIST.
   * Deployment: getting started application
     * On [STM32H747I-DISCO](stm32ai_application_code/image_classification/Application/STM32H747I-DISCO) with
       B-CAMS-OMV camera daughter board.
     * On [NUCLEO-H743ZI2](stm32ai_application_code/image_classification/Application/NUCLEO-H743ZI2) with B-CAMS-OMV
       camera daughter board, webcam or Arducam Mega 5MP as input and USB display or SPI display as output.
-* [Object detection](object_detection)
+* [Object detection (OD)](object_detection)
   * Models: ST SSD MobileNet v1, Tiny YOLO v2, SSD MobileNet v2 fpn lite, ST Yolo LC v1.
   * Deployment: getting started application
     * On [STM32H747I-DISCO](stm32ai_application_code/object_detection/Application/STM32H747I-DISCO) with B-CAMS-OMV
@@ -52,7 +60,7 @@ to train and deploy the models.
   * Models: Yamnet, MiniResnet, MiniResnet v2.
   * Deployment: getting started application
     * On [B-U585I-IOT02A](stm32ai_application_code) using RTOS, ThreadX or FreeRTOS.
-* [Hand posture recognition](hand_posture)
+* [Hand posture recognition (HPR)](hand_posture)
   * The hand posture use case is based on the ST multi-zone Time-of-Flight sensors: VL53L5CX, VL53L7CX, VL53L8CX. The
     goal of this use case is to recognize static hand posture such as a like, dislike or love sign done with user hand
     in front of the sensor. We are providing a complete workflow from data acquisition to model training, then
@@ -79,8 +87,12 @@ to train and deploy the models.
 * [stm32ai-tao](https://github.com/STMicroelectronics/stm32ai-tao): this GitHub repository provides Python scripts and
   Jupyter notebooks to manage a complete life cycle of a model from training, to compression, optimization and
   benchmarking using **NVIDIA TAO Toolkit** and STM32Cube.AI Developer Cloud.
+* [stm32ai-nota](https://github.com/STMicroelectronics/stm32ai-nota): this GitHub repository contains Jupyter notebooks that demonstrate how to use **NetsPresso** to prune pre-trained deep learning models from the model zoo, then fine-tune, quantize and benchmark them using STM32Cube.AI Developer Cloud for your specific use case.
 
 ## Before you start
+For a more in-depth guide on installing and setting up the model zoo and its requirements on your PC, especially
+when you are running behind a proxy in a corporate setup, follow the detailed wiki article
+on [How to install STM32 model zoo](https://wiki.st.com/stm32mcu/index.php?title=AI:How_to_install_STM32_model_zoo).
 
 * Create an account on myST and then sign in to [STM32Cube.AI Developer Cloud](https://stm32ai-cs.st.com/home) to be
   able access the service.
@@ -146,10 +158,6 @@ git clone https://github.com/STMicroelectronics/stm32ai-modelzoo.git
 pip install -r requirements.txt
 ```
 
-For more in depth guide on installing and setting up the model zoo and its requirement on your PC, specially in the
-cases when you are running behind the proxy in corporate setup, follow the detailed wiki article
-on [How to install STM32 model zoo](https://wiki.st.com/stm32mcu/index.php?title=AI:How_to_install_STM32_model_zoo).
-
 ## Jump start with Colab
 
 In [tutorials/notebooks](tutorials/notebooks/README.md) you will find a jupyter notebook that can be easily deployed on
````
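The release notes above call out the `.yaml` single entry point, mode chaining, BYOM and BYOD. As a rough illustration only, a configuration could look like the sketch below; the `general.model_path` attribute and the `chain_tqe` mode name are assumptions based on the conventions described elsewhere in this commit (see the `audio_event_detection/src/README.md` diff further down), not a definitive schema, which varies per use case.

```yaml
# Hypothetical sketch of the single .yaml entry point; the exact schema
# differs from one use case to another.
operation_mode: chain_tqe   # assumed chain name: train, quantize, evaluate in one launch

general:
  model_path: ../models/my_model.h5   # BYOM: start from your own model (attribute assumed)

dataset:
  # BYOD: fine-tune on your own data by pointing these paths at it
  training_audio_path: ../datasets/my_data/audio
  training_csv_path: ../datasets/my_data/labels.csv
```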

audio_event_detection/README.md

Lines changed: 0 additions & 2 deletions
```diff
@@ -2,14 +2,12 @@
 
 
 ## Directory components:
-
 * [datasets](datasets/README.md) placeholder for the audio event detection datasets.
 * [deployment](deployment/README.md) contains the necessary files to deploy models on an STM32 board
 * [pretrained_models](pretrained_models/README.md) a collection of optimized pretrained models on different audio datasets.
 * [src](src/README.md) contains tools to train, evaluate, benchmark and quantize your model on your STM32 target.
 
 ## Tutorials and documentation:
-
 * [Complete AED model zoo and configuration file documentation](src/README.md)
 * [A short tutorial on training a model using the model zoo](src/training/README.md)
 * [A short tutorial on quantizing a model using the model zoo](src/quantization/README.md)
```

audio_event_detection/pretrained_models/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/with_unknown_class/yamnet_256_64x96_tl_config.yaml

Lines changed: 3 additions & 1 deletion
```diff
@@ -47,7 +47,9 @@ dataset_specific:
   csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
   dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
   eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
-  audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
+  # Change this next line to the ontology path on your machine.
+  # Download the ontology at https://github.com/audioset/ontology
+  audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
   only_keep_monolabel: True
 
 preprocessing:
```

audio_event_detection/pretrained_models/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/without_unknown_class/yamnet_256_64x96_tl_config.yaml

Lines changed: 2 additions & 0 deletions
```diff
@@ -47,6 +47,8 @@ dataset_specific:
   csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
   dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
   eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
+  # Change this next line to the ontology path on your machine.
+  # Download the ontology at https://github.com/audioset/ontology
   audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
   only_keep_monolabel: True
 
```
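On a user's machine, the comment added in both configs amounts to replacing the repository-relative path with a local one, for example (the path below is purely illustrative):

```yaml
# Illustrative only: point this at wherever you saved the downloaded ontology.json.
audioset_ontology_path: /home/me/audioset/ontology.json
```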

audio_event_detection/src/README.md

Lines changed: 10 additions & 3 deletions
```diff
@@ -171,7 +171,7 @@ Environment variables can be used to avoid hardcoding in the configuration file
 
 #### <a id="3-2">3.2 Operation mode</a>
 
-The `operation_mode` top-level attribute specifies the operations you want to executed. This may be single operation or a set of chained operations.
+The `operation_mode` top-level attribute specifies the operations you want to execute. This may be a single operation or a set of chained operations.
 
 The different values of the `operation_mode` attribute and the corresponding operations are described in the table below. In the names of the chain modes, 't' stands for training, 'e' for evaluation, 'q' for quantization, 'b' for benchmark and 'd' for deployment.
 
```
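To make the naming convention in this hunk concrete, here is a small, hedged illustration; `chain_tqeb` is an assumed example name built from the t/e/q/b/d letters, not a value quoted from the table (the table itself is not part of this diff).

```yaml
# Run one service on its own:
operation_mode: training

# ...or chain several in a single launch. 'chain_tqeb' is an assumed
# example name (train -> quantize -> evaluate -> benchmark):
# operation_mode: chain_tqeb
```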

````diff
@@ -319,7 +319,7 @@ The model zoo provides support for some publicly available datasets. However, su
 
 Currently, only ESC-10 (which does not require any dataset-specific parameters) and FSD50K are supported by the model zoo. Thus, this section only covers parameters specific to FSD50K.
 
-For more details on how to train a model using FSD50K, please consult section <a href="#7">7. Training a model on FSD50K </a>
+For more details on how to train a model using FSD50K, please consult section <a href="#8">8. Training a model on FSD50K </a>
 
 
 ```yaml
@@ -331,14 +331,16 @@ dataset_specific:
   csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
   dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
   eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
+  # Change this next line to the ontology path on your machine.
+  # Download the ontology at https://github.com/audioset/ontology
   audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
   only_keep_monolabel: True
 ```
 
 - `csv_folder` : *string*, Folder where the dev and eval csv files are located. The default name for this folder in the archives downloaded from Zenodo is `FSD50K.ground_truth`
 - `dev_audio_folder` : *string*, Folder where the dev audio files are located. The default name for this folder in the archives downloaded from Zenodo is `FSD50K.dev_audio`
 - `eval_audio_folder` : *string*, Folder where the eval audio files are located. The default name for this folder in the archives downloaded from Zenodo is `FSD50K.eval_audio`
-- `audioset_ontology_path` : *string*, Path to the audioset ontology JSON file. The file is provided in the model zoo [here](./preprocessing/dataset_utils/fsd50k/audioset_ontology.json), but you can also download it from https://github.com/audioset/ontology/blob/master/ontology.json
+- `audioset_ontology_path` : *string*, Path to the audioset ontology JSON file. Due to licensing issues, the file is NOT provided in the model zoo; download it from https://github.com/audioset/ontology/blob/master/ontology.json
 - `only_keep_monolabel` : *boolean*, If set to True, discard all multi-label samples. This is a comparatively small proportion of all samples.
 
 #### <a id="3-7">3.7 Audio temporal domain preprocessing</a>
@@ -986,11 +988,16 @@ After extraction you should end up with the following folders :
 
 Strictly speaking, `FSD50K.metadata` and `FSD50K.doc` are unnecessary, so they can be deleted.
 
+Next, download the audioset ontology JSON file from https://github.com/audioset/ontology/blob/master/ontology.json
+
+Due to licensing concerns, we cannot provide this file directly in the zoo; you must download it yourself.
+
 **Set up the dataset-specific parameters**
 First, set `dataset.name` to `fsd50k` in the configuration file. See section <a href="3-5">3.5 Datasets</a> for more details.
 
 You will need to set some dataset-specific parameters in the configuration file.
 See <a href="3-6">3.6 Dataset-specific parameters </a> for a detailed description of each parameter.
+Don't forget to set the `audioset_ontology_path` attribute to the path where you downloaded the audioset ontology JSON file.
 
 **NOTE** The regular `training_audio_path`, `training_csv_path`, `validation_audio_path`, `validation_csv_path`, `validation_split` are unused when using FSD50K. Instead, the dev set is used as the training set, and the eval set as the validation set.
 
````
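Pulling the FSD50K instructions in this hunk together, the resulting configuration fragment might look like the following sketch; all keys are taken from the diff above, but the ontology path is illustrative and should point at your downloaded copy.

```yaml
# Sketch of FSD50K-specific settings after the manual ontology download.
dataset:
  name: fsd50k

dataset_specific:
  csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
  dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
  eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
  audioset_ontology_path: /home/me/audioset/ontology.json   # illustrative local path
  only_keep_monolabel: True
```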
