VAD not functional with whisper_full_with_state #3402

@tazz4843

I ran into this while debugging why VAD does not work at all in whisper-rs. It turns out whisper.cpp simply doesn't run VAD when the state variants of whisper_full are used. Is there a reason for this?

See

whisper.cpp/src/whisper.cpp

Lines 7722 to 7743 in 7745fcf

int whisper_full(
        struct whisper_context * ctx,
    struct whisper_full_params   params,
                   const float * samples,
                           int   n_samples) {
    std::vector<float> vad_samples;
    if (params.vad) {
        WHISPER_LOG_INFO("%s: VAD is enabled, processing speech segments only\n", __func__);
        if (!whisper_vad(ctx, ctx->state, params, samples, n_samples, vad_samples)) {
            WHISPER_LOG_ERROR("%s: failed to compute VAD\n", __func__);
            return -1;
        }
        if (vad_samples.empty()) {
            ctx->state->result_all.clear();
            return 0;
        }
        samples = vad_samples.data();
        n_samples = vad_samples.size();
    }
    return whisper_full_with_state(ctx, ctx->state, params, samples, n_samples);
}
and note how it calls VAD, then defers to the state function, while the state function (

whisper.cpp/src/whisper.cpp

Lines 6804 to 6809 in 7745fcf

int whisper_full_with_state(
        struct whisper_context * ctx,
          struct whisper_state * state,
    struct whisper_full_params   params,
                   const float * samples,
                           int   n_samples) {
) contains no VAD code at all.
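
To make the difference between the two call paths concrete, here is a minimal, self-contained model of the control flow. Everything in it is a hypothetical stand-in (`Params`, `State`, `toy_vad`, `full`, `full_with_state`), not whisper.cpp API, and the "VAD" is just an amplitude threshold chosen for illustration. It only mirrors the shape of the two functions quoted above: the wrapper filters the samples and then defers, while the state variant processes whatever it is given.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT whisper.cpp types.
struct Params { bool vad = false; };
struct State  { int n_processed = 0; };

// Toy "VAD" (assumption): keep only samples whose magnitude exceeds 0.1.
static std::vector<float> toy_vad(const float * samples, int n_samples) {
    std::vector<float> out;
    for (int i = 0; i < n_samples; ++i) {
        if (samples[i] > 0.1f || samples[i] < -0.1f) {
            out.push_back(samples[i]);
        }
    }
    return out;
}

// Mirrors the shape of whisper_full_with_state: params.vad is never consulted.
static int full_with_state(State & state, const Params &, const float *, int n_samples) {
    state.n_processed = n_samples; // records how many samples were "transcribed"
    return 0;
}

// Mirrors the shape of whisper_full: run VAD first, then defer to the state variant.
static int full(State & state, const Params & params, const float * samples, int n_samples) {
    std::vector<float> vad_samples;
    if (params.vad) {
        vad_samples = toy_vad(samples, n_samples);
        if (vad_samples.empty()) {
            state.n_processed = 0;
            return 0;
        }
        samples   = vad_samples.data();
        n_samples = static_cast<int>(vad_samples.size());
    }
    return full_with_state(state, params, samples, n_samples);
}
```

With `params.vad = true` and six input samples of which two are above the threshold, `full` hands two samples to the state variant, whereas calling `full_with_state` directly processes all six. That is the behavior a binding sees if it only ever calls the state function.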

Also see https://codeberg.org/tazz4843/whisper-rs/issues/242
