[feat] Add performance and separation quality benchmarks#9

Open
dhunstack wants to merge 2 commits into mixxxdj:main from dhunstack:benchmark
Conversation


dhunstack commented Oct 27, 2025

Add benchmarking for the PyTorch Demucs and C++ ONNX scripts:

  • Separation quality using SI-SDR
  • Performance metrics on CPU
  • Performance metrics on GPU
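For reference, SI-SDR (scale-invariant signal-to-distortion ratio) can be computed with a minimal NumPy sketch like the one below. This is the standard textbook definition, not necessarily the exact implementation in the benchmark scripts:

```python
import numpy as np

def si_sdr(reference, estimate, eps=1e-8):
    """Scale-Invariant Signal-to-Distortion Ratio in dB.

    Projects the estimate onto the reference so an overall gain
    difference does not affect the score (hence 'scale-invariant').
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Optimal scaling of the reference toward the estimate
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    noise = estimate - target
    return 10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(noise ** 2) + eps))
```

A perfectly separated stem scores arbitrarily high (even if rescaled), while added interference pulls the score down toward the dB ranges reported in the tables below.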

CPU Performance Comparison

Processing Speed

| Metric | PyTorch Model | C++ ONNX Model | Improvement |
| --- | --- | --- | --- |
| Total Processing Time | 5,380.35 sec | 4,415.30 sec | 17.94% faster |
| Processing Time for 1 min input | 25.89 sec | 21.24 sec | 17.94% faster |
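The derived figures in this table follow directly from the raw totals; a quick sanity check (values copied from this PR):

```python
# Reproduce the CPU table's derived metrics from the raw totals.
total_audio_sec = 12470.98   # total dataset duration (sec)
pytorch_total = 5380.35      # PyTorch total processing time (sec)
onnx_total = 4415.30         # C++ ONNX total processing time (sec)

def per_minute(total_proc_sec):
    """Processing time needed per one minute of input audio."""
    return total_proc_sec / total_audio_sec * 60.0

speedup_pct = (pytorch_total - onnx_total) / pytorch_total * 100.0

print(round(per_minute(pytorch_total), 2))  # 25.89
print(round(per_minute(onnx_total), 2))     # 21.24
print(round(speedup_pct, 2))                # 17.94
```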

Audio Quality (SI-SDR in dB)

| Stem | PyTorch Model (dB) | C++ ONNX Model (dB) | Difference |
| --- | --- | --- | --- |
| drums | 9.50 | 9.48 | -0.02 |
| bass | 7.87 | 7.80 | -0.07 |
| other | 4.75 | 4.66 | -0.09 |
| vocals | 7.90 | 7.82 | -0.08 |
| Overall | 7.50 | 7.44 | -0.06 |
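The "Overall" row is consistent, to within rounding, with an unweighted average of the four per-stem scores (an assumption about how it was aggregated; the scripts may average unrounded per-track values instead):

```python
# Check that the Overall row matches the unweighted mean of the stems
# (per-stem SI-SDR values copied from the CPU table above).
pytorch = {"drums": 9.50, "bass": 7.87, "other": 4.75, "vocals": 7.90}
onnx = {"drums": 9.48, "bass": 7.80, "other": 4.66, "vocals": 7.82}

mean_pytorch = sum(pytorch.values()) / len(pytorch)  # ~7.50
mean_onnx = sum(onnx.values()) / len(onnx)           # ~7.44
print(mean_pytorch, mean_onnx)
```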

Summary Statistics

| Dataset | PyTorch Model | C++ ONNX Model |
| --- | --- | --- |
| Total Tracks | 50 | 50 |
| Total Audio Duration | 12,470.98 sec (~3.46 hours) | 12,470.98 sec (~3.46 hours) |
| Model Identifier | `htdemucs` | `../onnx-models/htdemucs.ort` |

Key Findings

  • Performance: The C++ ONNX model is 17.94% faster than the PyTorch implementation
  • Quality: Audio separation quality is nearly identical, with only minor differences (< 0.1 dB) in SI-SDR scores
  • Consistency: Both models processed the same 50 tracks from the dataset with no failures

The results demonstrate that the ONNX export maintains audio quality while providing a meaningful performance improvement for deployment scenarios.

GPU Performance Comparison

Unlike the CPU comparison, which uses the C++ scripts, this ONNX inference was run in Python using ONNX Runtime.
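A minimal sketch of that Python path is shown below, assuming a single-input, single-output export with a (batch, channels, samples) layout; the actual input name, segment length, and output shape depend on how the model was exported:

```python
import numpy as np

def to_model_input(audio):
    """Shape a waveform into the (1, 2, N) float32 batch layout the
    exported Demucs graph is assumed to take (batch, channels, samples)."""
    x = np.asarray(audio, dtype=np.float32)
    if x.ndim == 1:              # mono -> duplicate into two channels
        x = np.stack([x, x])
    return x[np.newaxis, ...]    # add the batch dimension

def separate(model_path, audio):
    """Run one forward pass through ONNX Runtime, preferring the GPU."""
    import onnxruntime as ort
    session = ort.InferenceSession(
        model_path,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: to_model_input(audio)})
    return outputs[0]            # assumed shape: (1, n_stems, 2, N)
```

ONNX Runtime falls back to the CPU provider automatically when CUDA is unavailable, so the same script can drive both the GPU and CPU measurements.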

Processing Speed

| Metric | PyTorch Model (GPU) | Python ONNXRuntime (GPU) | Difference |
| --- | --- | --- | --- |
| Total Processing Time | 354.13 sec | 386.40 sec | 8.35% slower |
| Processing Time for 1 min input | 1.70 sec | 1.86 sec | 8.35% slower |

Audio Quality (SI-SDR in dB)

| Stem | PyTorch Model (dB) | Python ONNXRuntime (dB) | Difference |
| --- | --- | --- | --- |
| drums | 9.49 | 9.50 | +0.01 |
| bass | 7.77 | 7.88 | +0.11 |
| other | 4.72 | 4.76 | +0.04 |
| vocals | 7.85 | 7.87 | +0.02 |
| Overall | 7.46 | 7.50 | +0.04 |

Summary Statistics

| Dataset | PyTorch Model (GPU) | Python ONNXRuntime (GPU) |
| --- | --- | --- |
| Total Tracks | 50 | 50 |
| Total Audio Duration | 12,470.98 sec (~3.46 hours) | 12,470.98 sec (~3.46 hours) |
| Model Identifier | `htdemucs` | `htdemucs` |

Key Findings

  • Performance: The PyTorch model is 8.35% faster than the ONNX model run in Python.
  • Quality: Audio separation quality is very similar, with the ONNX model showing slightly better SI-SDR scores (+0.04 dB overall)
  • Consistency: Both models processed the same 50 tracks from the dataset with no failures

The results show that on GPU, PyTorch maintains a performance advantage while both models deliver comparable audio quality.

Add Cpp scripts for running inference on ONNX exported Demucs

Signed-off-by: Anmol Mishra <anmolmishra1997@gmail.com>
Add benchmarking for PyTorch Demucs and C++ ONNX scripts -
 - Separation quality using SI-SDR
 - Performance metrics on CPU

Signed-off-by: Anmol Mishra <anmolmishra1997@gmail.com>
