Skip to content

[feat] Add Cpp scripts for inference#7

Open
dhunstack wants to merge 1 commit intomixxxdj:mainfrom
dhunstack:onnxrt
Open

[feat] Add Cpp scripts for inference#7
dhunstack wants to merge 1 commit intomixxxdj:mainfrom
dhunstack:onnxrt

Conversation

@dhunstack
Copy link
Collaborator

Add Cpp scripts for running inference on ONNX exported Demucs

Copy link
Member

@acolombier acolombier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should trim down the C++ code to a minimum (remove edge case and redundant libraries) and move this into an examples/cpp folder instead.
The C++ doesn't add any value to demucs itself, and instead shows how to infer an ORT model (+ load/save an audio file with libnyquist). THe current cppscript is misleading as it sounds like if the C++ could be used fort any demucs development

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for? I cannot see any reference to demucsonnx::stft and demucsonnx::istft. My understanding is that the stft and istft function are already part of the model, is that intending to swap them with explicit implementation? If yes, how does that work with GPU/NPU devices?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify the usecase for this dependency? Looking at this example, it looks like the ONNX library already contains its own data type to manage tensor, which can create from interleaved 32-bits float audio samples. What is the added benefits?

Comment on lines +82 to +103
// create a struct to hold two float vectors for left and right channels
Eigen::MatrixXf ret(2, N);

if (fileData->channelCount == 1)
{
// Mono case
for (std::size_t i = 0; i < N; ++i)
{
ret(0, i) = fileData->samples[i]; // left channel
ret(1, i) = fileData->samples[i]; // right channel
}
}
else
{
// Stereo case
for (std::size_t i = 0; i < N; ++i)
{
ret(0, i) = fileData->samples[2 * i]; // left channel
ret(1, i) = fileData->samples[2 * i + 1]; // right channel
}
}

Copy link
Member

@acolombier acolombier Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks suboptimal - Instead, you should create tensor directly from the fileData->samples using Ort::Value::CreateTensor, which is already an interleaved buffer. In case of Mono, you could reorganise the buffer (realloc, iterate to reorganise in place). This should be a good usecase of SIMD to optimise, which I believe it cannot at the moment due to the () operator.

Add Cpp scripts for running inference on ONNX exported Demucs

Signed-off-by: Anmol Mishra <anmolmishra1997@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants