Commit fb06762

Release AI-ModelZoo-2.0.1:
- Disclosed some ST object detection models: `st_yolo_lc_v1` and `ssd_mobilenet_v2_fpnlite`, with various resolutions.
- Disclosed some ST image classification models: `st_efficientnet_lc_v1`, `st_fdmobilenet_v1`, `st_resnet_8_hybrid_v1`, and `st_resnet_8_hybrid_v2`, in different resolutions.
- Fixed minor bugs and updated the README documentation.

Signed-off-by: khaoula boutiche <[email protected]>
1 parent e8d5d6d commit fb06762

164 files changed: +4790 −6397 lines changed


README.md

Lines changed: 16 additions & 8 deletions
````diff
@@ -20,6 +20,13 @@ The performances on reference STM32 MCU and MPU are provided for float and quant
 This project is organized by application, for each application you will have a step by step guide that will indicate how
 to train and deploy the models.
 
+## What's new in release 2.0:
+* An aligned and `uniform architecture` for all the use cases
+* A modular design to run the different operation modes (training, benchmarking, evaluation, deployment, quantization) independently, or to chain multiple modes in a single launch.
+* A simple, `single entry point` to the code: a .yaml configuration file configures all the needed services.
+* Support of the `Bring Your Own Model (BYOM)` feature, allowing users to (re-)train their own models. An example is provided [here](./image_classification/src/training/README.md#51-training-your-own-model).
+* Support of the `Bring Your Own Data (BYOD)` feature, allowing users to fine-tune pretrained models with their own datasets. An example is provided [here](./image_classification/src/training/README.md#23-dataset-specification).
+
 
 <div align="center" style="margin-top: 80px; padding: 20px 0;">
 <p align="center">
@@ -30,16 +37,17 @@ to train and deploy the models.
 </div>
 
 ## Available use-cases
-
-* [Image classification](image_classification)
+>[!TIP]
+> For all the use cases below, quick and easy examples are provided and can be executed for a fast ramp-up (click on the use-case links below).
+* [Image classification (IC)](image_classification)
   * Models: EfficientNet, MobileNet v1, MobileNet v2, Resnet v1 including with hybrid quantization,
     SqueezeNet v1.1, STMNIST.
   * Deployment: getting started application
     * On [STM32H747I-DISCO](stm32ai_application_code/image_classification/Application/STM32H747I-DISCO) with
       B-CAMS-OMV camera daughter board.
     * On [NUCLEO-H743ZI2](stm32ai_application_code/image_classification/Application/NUCLEO-H743ZI2) with B-CAMS-OMV
       camera daughter board, webcam or Arducam Mega 5MP as input and USB display or SPI display as output.
-* [Object detection](object_detection)
+* [Object detection (OD)](object_detection)
   * Models: ST SSD MobileNet v1, Tiny YOLO v2, SSD MobileNet v2 fpn lite, ST Yolo LC v1.
   * Deployment: getting started application
     * On [STM32H747I-DISCO](stm32ai_application_code/object_detection/Application/STM32H747I-DISCO) with B-CAMS-OMV
@@ -52,7 +60,7 @@ to train and deploy the models.
   * Models: Yamnet, MiniResnet, MiniResnet v2.
   * Deployment: getting started application
     * On [B-U585I-IOT02A](stm32ai_application_code) using RTOS, ThreadX or FreeRTOS.
-* [Hand posture recognition](hand_posture)
+* [Hand posture recognition (HPR)](hand_posture)
   * The hand posture use case is based on the ST multi-zone Time-of-Flight sensors: VL53L5CX, VL53L7CX, VL53L8CX. The
     goal of this use case is to recognize static hand posture such as a like, dislike or love sign done with user hand
     in front of the sensor. We are providing a complete workflow from data acquisition to model training, then
@@ -79,8 +87,12 @@ to train and deploy the models.
 * [stm32ai-tao](https://github.com/STMicroelectronics/stm32ai-tao): this GitHub repository provides Python scripts and
   Jupyter notebooks to manage a complete life cycle of a model from training, to compression, optimization and
   benchmarking using **NVIDIA TAO Toolkit** and STM32Cube.AI Developer Cloud.
+* [stm32ai-nota](https://github.com/STMicroelectronics/stm32ai-nota): this GitHub repository contains Jupyter notebooks that demonstrate how to use **NetsPresso** to prune pre-trained deep learning models from the model zoo, then fine-tune, quantize and benchmark them using STM32Cube.AI Developer Cloud for your specific use case.
 
 ## Before you start
+For a more in-depth guide on installing and setting up the model zoo and its requirements on your PC, especially
+when you are running behind a proxy in a corporate setup, follow the detailed wiki article
+on [How to install STM32 model zoo](https://wiki.st.com/stm32mcu/index.php?title=AI:How_to_install_STM32_model_zoo).
 
 * Create an account on myST and then sign in to [STM32Cube.AI Developer Cloud](https://stm32ai-cs.st.com/home) to be
   able access the service.
@@ -146,10 +158,6 @@ git clone https://github.com/STMicroelectronics/stm32ai-modelzoo.git
 pip install -r requirements.txt
 ```
 
-For more in depth guide on installing and setting up the model zoo and its requirement on your PC, specially in the
-cases when you are running behind the proxy in corporate setup, follow the detailed wiki article
-on [How to install STM32 model zoo](https://wiki.st.com/stm32mcu/index.php?title=AI:How_to_install_STM32_model_zoo).
-
 ## Jump start with Colab
 
 In [tutorials/notebooks](tutorials/notebooks/README.md) you will find a jupyter notebook that can be easily deployed on
````
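The release notes above call out the `.yaml` single entry point, mode chaining, BYOM and BYOD. As a rough illustration only, a configuration could look like the sketch below; the `general.model_path` attribute and the `chain_tqe` mode name are assumptions based on the conventions described elsewhere in this commit (see the `audio_event_detection/src/README.md` diff further down), not a definitive schema, which varies per use case.

```yaml
# Hypothetical sketch of the single .yaml entry point; the exact schema
# differs from one use case to another.
operation_mode: chain_tqe   # assumed chain name: train, quantize, evaluate in one launch

general:
  model_path: ../models/my_model.h5   # BYOM: start from your own model (attribute assumed)

dataset:
  # BYOD: fine-tune on your own data by pointing these paths at it
  training_audio_path: ../datasets/my_data/audio
  training_csv_path: ../datasets/my_data/labels.csv
```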

audio_event_detection/README.md

Lines changed: 0 additions & 2 deletions
```diff
@@ -2,14 +2,12 @@
 
 
 ## Directory components:
-
 * [datasets](datasets/README.md) placeholder for the audio event detection datasets.
 * [deployment](deployment/README.md) contains the necessary files to deploy models on an STM32 board
 * [pretrained_models](pretrained_models/README.md) a collection of optimized pretrained models on different audio datasets.
 * [src](src/README.md) contains tools to train, evaluate, benchmark and quantize your model on your STM32 target.
 
 ## Tutorials and documentation:
-
 * [Complete AED model zoo and configuration file documentation](src/README.md)
 * [A short tutorial on training a model using the model zoo](src/training/README.md)
 * [A short tutorial on quantizing a model using the model zoo](src/quantization/README.md)
```

audio_event_detection/pretrained_models/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/with_unknown_class/yamnet_256_64x96_tl_config.yaml

Lines changed: 3 additions & 1 deletion
```diff
@@ -47,7 +47,9 @@ dataset_specific:
   csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
   dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
   eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
-  audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
+  # Change this next line to the ontology path on your machine.
+  # Download the ontology at https://github.com/audioset/ontology
+  audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
   only_keep_monolabel: True
 
 preprocessing:
```

audio_event_detection/pretrained_models/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/without_unknown_class/yamnet_256_64x96_tl_config.yaml

Lines changed: 2 additions & 0 deletions
```diff
@@ -47,6 +47,8 @@ dataset_specific:
   csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
   dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
   eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
+  # Change this next line to the ontology path on your machine.
+  # Download the ontology at https://github.com/audioset/ontology
   audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
   only_keep_monolabel: True
 
```
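On a user's machine, the comment added in both configs amounts to replacing the repository-relative path with a local one, for example (the path below is purely illustrative):

```yaml
# Illustrative only: point this at wherever you saved the downloaded ontology.json.
audioset_ontology_path: /home/me/audioset/ontology.json
```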

audio_event_detection/src/README.md

Lines changed: 10 additions & 3 deletions
```diff
@@ -171,7 +171,7 @@ Environment variables can be used to avoid hardcoding in the configuration file
 
 #### <a id="3-2">3.2 Operation mode</a>
 
-The `operation_mode` top-level attribute specifies the operations you want to executed. This may be single operation or a set of chained operations.
+The `operation_mode` top-level attribute specifies the operations you want to execute. This may be a single operation or a set of chained operations.
 
 The different values of the `operation_mode` attribute and the corresponding operations are described in the table below. In the names of the chain modes, 't' stands for training, 'e' for evaluation, 'q' for quantization, 'b' for benchmark and 'd' for deployment.
 
```
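To make the naming convention in this hunk concrete, here is a small, hedged illustration; `chain_tqeb` is an assumed example name built from the t/e/q/b/d letters, not a value quoted from the table (the table itself is not part of this diff).

```yaml
# Run one service on its own:
operation_mode: training

# ...or chain several in a single launch. 'chain_tqeb' is an assumed
# example name (train -> quantize -> evaluate -> benchmark):
# operation_mode: chain_tqeb
```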

````diff
@@ -319,7 +319,7 @@ The model zoo provides support for some publicly available datasets. However, su
 
 Currently, only ESC-10 (which does not require any dataset-specific parameters) and FSD50K are supported by the model zoo. Thus, this section only covers parameters specific to FSD50K.
 
-For more details on how to train a model using FSD50K, please consult section <a href="#7">7. Training a model on FSD50K </a>
+For more details on how to train a model using FSD50K, please consult section <a href="#8">8. Training a model on FSD50K </a>
 
 
 ```yaml
@@ -331,14 +331,16 @@ dataset_specific:
   csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
   dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
   eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
+  # Change this next line to the ontology path on your machine.
+  # Download the ontology at https://github.com/audioset/ontology
   audioset_ontology_path: preprocessing/dataset_utils/fsd50k/audioset_ontology.json
   only_keep_monolabel: True
 ```
 
 - `csv_folder` : *string*, Folder where the dev and eval csv files are located. The default name for this folder in the archives downloaded from Zenodo is `FSD50K.ground_truth`
 - `dev_audio_folder` : *string*, Folder where the dev audio files are located. The default name for this folder in the archives downloaded from Zenodo is `FSD50K.dev_audio`
 - `eval_audio_folder` : *string*, Folder where the eval audio files are located. The default name for this folder in the archives downloaded from Zenodo is `FSD50K.eval_audio`
-- `audioset_ontology_path` : *string*, Path to the audioset ontology JSON file. The file is provided in the model zoo [here](./preprocessing/dataset_utils/fsd50k/audioset_ontology.json), but you can also download it from https://github.com/audioset/ontology/blob/master/ontology.json
+- `audioset_ontology_path` : *string*, Path to the audioset ontology JSON file. Due to licensing issues, the file is NOT provided in the model zoo; download it from https://github.com/audioset/ontology/blob/master/ontology.json
 - `only_keep_monolabel` : *boolean*, If set to True, discard all multi-label samples. This is a comparatively small proportion of all samples.
 
 #### <a id="3-7">3.7 Audio temporal domain preprocessing</a>
@@ -986,11 +988,16 @@ After extraction you should end up with the following folders :
 
 Strictly speaking, `FSD50K.metadata` and `FSD50K.doc` are unnecessary, so they can be deleted.
 
+Next, download the audioset ontology JSON file from https://github.com/audioset/ontology/blob/master/ontology.json
+
+Due to licensing concerns, we cannot provide this file directly in the zoo; you must download it yourself.
+
 **Set up the dataset-specific parameters**
 First, set `dataset.name` to `fsd50k` in the configuration file. See section <a href="3-5">3.5 Datasets</a> for more details.
 
 You will need to set some dataset-specific parameters in the configuration file.
 See <a href="3-6">3.6 Dataset-specific parameters </a> for a detailed description of each parameter.
+Don't forget to set the `audioset_ontology_path` attribute to the path where you downloaded the audioset ontology JSON file.
 
 **NOTE** The regular `training_audio_path`, `training_csv_path`, `validation_audio_path`, `validation_csv_path`, `validation_split` are unused when using FSD50K. Instead, the dev set is used as the training set, and the eval set as the validation set.
 
````
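Pulling the FSD50K instructions in this hunk together, the resulting configuration fragment might look like the following sketch; all keys are taken from the diff above, but the ontology path is illustrative and should point at your downloaded copy.

```yaml
# Sketch of FSD50K-specific settings after the manual ontology download.
dataset:
  name: fsd50k

dataset_specific:
  csv_folder: ../datasets/FSD50K/FSD50K.ground_truth
  dev_audio_folder: ../datasets/FSD50K/FSD50K.dev_audio
  eval_audio_folder: ../datasets/FSD50K/FSD50K.eval_audio
  audioset_ontology_path: /home/me/audioset/ontology.json   # illustrative local path
  only_keep_monolabel: True
```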
