
AudioAI-ModelZoo

A collection of optimized Deep Neural Network (DNN) models for audio tasks on TI EdgeAI processors. The models are converted from PyTorch and TensorFlow into embedded-friendly formats optimized for TI SoCs.

Notice: The models in this repository are made available for experimentation and development; they are not intended for deployment in production.

System Requirements

  • Processors: AM62A
  • TIDL Version: 11_01_06_00

Quick Start

Clone the repository

On the Linux command line of the target (AM62A):

mkdir -p ~/tidl && cd ~/tidl
git clone https://github.com/TexasInstruments-Sandbox/audioai-modelzoo.git
cd audioai-modelzoo

Download Models and Model Artifacts

./download_models.sh -y
./download_artifacts.sh -y

Both scripts provide interactive menus for selecting which models to download; the -y flag accepts the prompts automatically.

Docker Image Setup

This repository uses a two-stage Docker build process (see the docker folder). The base image contains all dependencies and is pre-built and available from the GitHub Container Registry; the TI-specific image adds processor-specific libraries on top of it.

Pull the pre-built base image and build the TI image:

docker pull ghcr.io/texasinstruments-sandbox/audioai-base:11.1.0
docker tag ghcr.io/texasinstruments-sandbox/audioai-base:11.1.0 audioai-base:11.1.0
cd docker
./docker_build_ti.sh

To build the base image from scratch instead of pulling it, run ./docker_build_base.sh before building the TI image.

Start Jupyter Server

Launch the container:

~/tidl/audioai-modelzoo/docker/docker_run.sh

Inside the container, start Jupyter Lab:

./jupyter_lab.sh

The script will display a highlighted access URL. Open it in your browser to access Jupyter Lab with three inference notebooks pre-loaded in tabs.

JupyterLab running inside the Docker container

Pre-Trained Models

Models are located in the models folder.

Speech Enhancement (Audio-to-Audio)

GTCRN

Inference in Jupyter Notebook: inference/gtcrn_se/gtcrn_inference.ipynb
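
For a quick sanity check outside the notebook, the exported ONNX graph can be inspected with onnxruntime. The sketch below is illustrative only; the model path is an assumption (adjust it to wherever download_models.sh placed the model), and the notebook remains the reference workflow.

# Minimal sketch: list the GTCRN model's inputs and outputs.
# The model path is an assumption; adjust to the downloaded location.
import onnxruntime as ort

sess = ort.InferenceSession("models/gtcrn_se/gtcrn.onnx")
for inp in sess.get_inputs():
    print("input:", inp.name, inp.shape, inp.type)
for out in sess.get_outputs():
    print("output:", out.name, out.shape, out.type)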

Sound Classification (Audio-to-Class)

VGGish11

Inference in Jupyter Notebook: inference/vggish11_sc/vggish_inference.ipynb

Python script version (run inside the Docker container):

cd ~/tidl/audioai-modelzoo/inference/vggish11_sc
python3 vggish_infer_audio.py --audio-file sample_wav/139951-9-0-9.wav
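
The script handles preprocessing internally. For reference, the standard VGGish front end turns 16 kHz audio into log-mel spectrogram patches of 96 frames x 64 mel bands (25 ms window, 10 ms hop); the sketch below reproduces that front end under the assumption that librosa is available, and is illustrative only.

# Sketch of the standard VGGish front end (librosa availability is an
# assumption; the repository's script performs the equivalent steps itself).
import numpy as np
import librosa

y, sr = librosa.load("sample_wav/139951-9-0-9.wav", sr=16000)
mel = librosa.feature.melspectrogram(
    y=y, sr=sr, n_fft=400, hop_length=160, n_mels=64, power=1.0)
log_mel = np.log(mel.T + 0.01)                  # (frames, 64), stabilized log
n = log_mel.shape[0] // 96                      # whole 0.96 s patches
patches = log_mel[: n * 96].reshape(n, 96, 64)
print(patches.shape)                            # (n, 96, 64)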

YAMNet

Inference in Jupyter Notebook: inference/yamnet_sc/yamnet_inference.ipynb

Python script version (run inside the Docker container):

cd ~/tidl/audioai-modelzoo/inference/yamnet_sc
python3 yamnet_infer_audio.py --audio-file samples/miaow_16k.wav
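
YAMNet classifies fixed-length patches of the waveform, and the script aggregates the per-patch scores. The 7-patch figure in the benchmark table below is consistent with non-overlapping 0.96 s patches; the sketch shows only that bookkeeping (the framing is an assumption, the script is authoritative).

# Patch count for a clip, assuming non-overlapping 0.96 s patches.
def num_patches(duration_s: float, patch_s: float = 0.96) -> int:
    return int(duration_s // patch_s)

print(num_patches(6.73))  # 7, matching the benchmark table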

Performance Benchmarks

Model             Input Audio (s)     Inference Time (ms)   Real-Time Factor
GTCRN (FP32)      9.77                679.90                0.070
VGGish11 (INT8)   4.00                8.88                  0.002
YAMNet (INT8)     6.73 (7 patches)    17.53 (total)         0.003

Note: Real-Time Factor (RTF) = Processing Time / Audio Duration. RTF < 1.0 means faster than real-time. Performance metrics may vary depending on system conditions.
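
As a worked check against the table, RTF can be recomputed directly from the two measured quantities:

# Real-Time Factor: processing time divided by audio duration (dimensionless).
def rtf(inference_ms: float, audio_s: float) -> float:
    return (inference_ms / 1000.0) / audio_s

print(f"{rtf(679.90, 9.77):.3f}")  # 0.070 (GTCRN FP32)
print(f"{rtf(8.88, 4.00):.3f}")    # 0.002 (VGGish11 INT8)
print(f"{rtf(17.53, 6.73):.3f}")   # 0.003 (YAMNet INT8, total over 7 patches)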
