A collection of optimized Deep Neural Network (DNN) models for audio tasks on TI EdgeAI processors. Models are converted from PyTorch and TensorFlow into embedded-friendly formats optimized for TI SoCs.
Notice: The models in this repository are made available for experimentation and development; they are not intended for deployment in production.
- Processors: AM62A
- TIDL Version: 11_01_06_00
On the Linux command line on the target (AM62A):

```bash
mkdir -p ~/tidl && cd ~/tidl
git clone https://github.com/TexasInstruments-Sandbox/audioai-modelzoo.git
cd audioai-modelzoo
./download_models.sh -y
./download_artifacts.sh -y
```

Both scripts provide interactive menus to select and download models.
This repository uses a two-stage Docker build process (see docker folder). The base image contains all dependencies and is pre-built and available from GitHub Container Registry. The TI-specific image adds processor-specific libraries on top of the base.
Pull the pre-built base image and build the TI image:

```bash
docker pull ghcr.io/texasinstruments-sandbox/audioai-base:11.1.0
docker tag ghcr.io/texasinstruments-sandbox/audioai-base:11.1.0 audioai-base:11.1.0
cd docker
./docker_build_ti.sh
```

If you want to build the base image from scratch instead of pulling it, run ./docker_build_base.sh before building the TI image.
Launch the container:

```bash
~/tidl/audioai-modelzoo/docker/docker_run.sh
```

Inside the container, start Jupyter Lab:

```bash
./jupyter_lab.sh
```

The script displays a highlighted access URL. Open it in your browser to access Jupyter Lab with three inference notebooks pre-loaded in tabs.
Models are located in the models folder.
Inference in Jupyter Notebook: inference/gtcrn_se/gtcrn_inference.ipynb
Inference in Jupyter Notebook: inference/vggish11_sc/vggish_inference.ipynb
Python script version (run inside the Docker container):

```bash
cd ~/tidl/audioai-modelzoo/inference/vggish11_sc
python3 vggish_infer_audio.py --audio-file sample_wav/139951-9-0-9.wav
```

Inference in Jupyter Notebook: inference/yamnet_sc/yamnet_inference.ipynb
Python script version (run inside the Docker container):

```bash
cd ~/tidl/audioai-modelzoo/inference/yamnet_sc
python3 yamnet_infer_audio.py --audio-file samples/miaow_16k.wav
```

| Model | Input Audio (sec) | Inference Time (ms) | Real-Time Factor |
|---|---|---|---|
| GTCRN (FP32) | 9.77 | 679.90 | 0.070 |
| VGGish11 (INT8) | 4.00 | 8.88 | 0.002 |
| YAMNet (INT8) | 6.73 (7 patches) | 17.53 total | 0.003 |
Note: Real-Time Factor (RTF) = Processing Time / Audio Duration. RTF < 1.0 means faster than real-time. Performance metrics may vary depending on system conditions.
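The figures in the table can be sanity-checked with a few lines of arithmetic. Below is a minimal sketch: the input values are copied from the table above, while YAMNet's 0.96 s analysis window and back-to-back hop (hop equal to window) are assumptions about the patching scheme, not something this table confirms.

```python
import math

# Measured figures copied from the benchmark table above.
rows = {
    "GTCRN (FP32)":    {"audio_s": 9.77, "infer_ms": 679.90},
    "VGGish11 (INT8)": {"audio_s": 4.00, "infer_ms": 8.88},
    "YAMNet (INT8)":   {"audio_s": 6.73, "infer_ms": 17.53},
}

for model, r in rows.items():
    # RTF = processing time / audio duration, both in seconds.
    rtf = (r["infer_ms"] / 1000.0) / r["audio_s"]
    print(f"{model}: RTF = {rtf:.3f}")

# The "7 patches" figure for YAMNet is consistent with a 0.96 s
# analysis window applied back to back (hop == window; an assumption):
patches = math.floor(6.73 / 0.96)
print(f"YAMNet patches: {patches}")
```

Running this reproduces the Real-Time Factor column (0.070, 0.002, 0.003) and the 7-patch count for the 6.73 s YAMNet input.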
