Conversation

alubbe
Contributor

@alubbe alubbe commented Aug 28, 2025

This PR brings over the --carry-initial-prompt flag from the Python library (openai/whisper#2343)

By default, a --prompt (initial prompt) is only used for the first decoding window; subsequent windows rely on the text generated so far for continuity. When you pass --carry-initial-prompt, the initial prompt tokens are explicitly prepended to every internal decode window. This mirrors the Python reference implementation's carry_initial_prompt behavior and can help enforce custom vocabulary or style throughout long transcriptions. The trade-off is that it may slightly reduce the model's ability to adapt dynamically to newly generated context (and can increase the risk of repetitions if the prompt is long). If the combined size of the carried initial prompt and the rolling context exceeds half the model's text context, the leftmost (oldest) part of the initial prompt is truncated to fit.
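As a rough sketch of the budgeting described above (the function name, token type, and the exact split between the carried prompt and the rolling context are assumptions for illustration, not whisper.cpp's actual code):

```cpp
#include <algorithm>
#include <vector>

using token = int;

// Illustrative only: assemble the prompt for one decode window when
// --carry-initial-prompt is active. The prompt budget is half the text
// context; if initial + rolling exceed it, the oldest (leftmost) initial
// prompt tokens are dropped first.
std::vector<token> build_window_prompt(const std::vector<token> & initial,
                                       const std::vector<token> & rolling,
                                       int n_text_ctx) {
    const int budget = n_text_ctx / 2;
    // keep at most budget - 1 tokens of the carried prompt (its tail)
    const int n_init = std::min<int>((int) initial.size(), std::max(budget - 1, 0));
    // fill the remainder with the newest rolling-context tokens
    const int n_roll = std::min<int>((int) rolling.size(), budget - n_init);

    std::vector<token> prompt;
    prompt.insert(prompt.end(), initial.end() - n_init, initial.end());
    prompt.insert(prompt.end(), rolling.end() - n_roll, rolling.end());
    return prompt;
}
```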

Collaborator

@KitaitiMakoto KitaitiMakoto left a comment


Patches for Ruby are nice, though I'm not sure whether the essential changes and API will be accepted. Let me point out just one thing.

@alubbe
Contributor Author

alubbe commented Sep 8, 2025

Changes applied - let me know what you think of this PR

@alubbe alubbe requested a review from KitaitiMakoto September 8, 2025 11:02
@alubbe
Contributor Author

alubbe commented Sep 25, 2025

@ggerganov could I ask you to review this PR?

@alubbe
Contributor Author

alubbe commented Oct 7, 2025

Any update on this? Happy to resolve the conflict in cli.cpp if I can get a general sense of whether you're interested in merging this functionality. We've been using this branch for weeks now and have observed much improved transcription from whisper when it comes to unusual names (people, places, companies, etc.) from carrying the initial prompt.

@ggerganov
Member

Hi, apologies for the long wait. I'm interested in adding this functionality, but I am having difficulty following the implemented logic for prepending the initial prompt. I would like to see this simplified in some way. I'll try to add some suggestions on how to improve it.

Member

@ggerganov ggerganov left a comment


I think the main complexity comes from using a single prompt_past vector in the whisper_state which results in some convoluted logic for deduplicating and slicing the tokens.

I expect that the logic can become much simpler if you replace prompt_past with 2 vectors: prompt_past0 and prompt_past1. The full prompt is a concatenation of prompt_past0 + prompt_past1. The prompt_past0 can be utilized to store some static prefix - i.e. the original prompt that is being carried.
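A minimal sketch of that split (the struct and the full_prompt method are hypothetical; only the two vector names come from the suggestion):

```cpp
#include <vector>

using token = int;

// Hypothetical state layout: the full prompt is always
// prompt_past0 + prompt_past1, so no deduplication or slicing of a
// single combined vector is needed.
struct prompt_state {
    std::vector<token> prompt_past0; // static prefix: the carried initial prompt
    std::vector<token> prompt_past1; // rolling context, rebuilt every window

    std::vector<token> full_prompt() const {
        std::vector<token> p(prompt_past0);
        p.insert(p.end(), prompt_past1.begin(), prompt_past1.end());
        return p;
    }
};
```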

@alubbe
Contributor Author

alubbe commented Oct 8, 2025

That's a good point. I tried taking it a bit further and simplifying it as much as I could - what do you think?

@alubbe
Contributor Author

alubbe commented Oct 8, 2025

Pushed PR fixes - let me know what you think

@alubbe
Contributor Author

alubbe commented Oct 10, 2025

I've included your suggestion and made two further simplifications:

  1. Since n_take0 = std::min<int>(max_ctx_half - 1, prompt_past0.size()), we know n_take0 <= prompt_past0.size(), so we can drop that ternary.
  2. Inserting from an empty range is a no-op, so the if (n_take0 > 0) and if (n_take1 > 0) checks are redundant and can be removed.
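Both simplifications can be demonstrated in a standalone sketch (the identifiers n_take0, prompt_past0, and max_ctx_half mirror the discussion, but the surrounding function is illustrative):

```cpp
#include <algorithm>
#include <vector>

using token = int;

// Illustrative: take the carried prefix for one window.
std::vector<token> take_carried_prefix(const std::vector<token> & prompt_past0,
                                       int max_ctx_half) {
    // (1) n_take0 <= prompt_past0.size() by construction, so no ternary is
    //     needed (std::max guards the degenerate max_ctx_half <= 0 case)
    const int n_take0 = std::min<int>(std::max(max_ctx_half - 1, 0),
                                      (int) prompt_past0.size());

    std::vector<token> prompt;
    // (2) when n_take0 == 0 this inserts an empty range, which is a no-op,
    //     so no `if (n_take0 > 0)` guard is required
    prompt.insert(prompt.end(), prompt_past0.end() - n_take0, prompt_past0.end());
    return prompt;
}
```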

@alubbe
Contributor Author

alubbe commented Oct 10, 2025

I did another check of all the logic here and I think I found another issue: we only want to run line 7569 outside carry_initial_prompt mode, because in that mode prompt_past1 would contain part of prompt_past0, so we would be duplicating content.

Comment on lines +7567 to +7570

```cpp
// update prompt_past1
prompt_past1.clear();
if (!params.carry_initial_prompt && !prompt.empty() && prompt.front() == whisper_token_prev(ctx)) {
    prompt_past1.insert(prompt_past1.end(), prompt.begin() + 1, prompt.end() - prompt_init.size());
```
Member


An alternative would be to preserve the original behaviour by using the fact that prompt.size() > prompt_past0.size():

Suggested change:

```diff
-// update prompt_past1
-prompt_past1.clear();
-if (!params.carry_initial_prompt && !prompt.empty() && prompt.front() == whisper_token_prev(ctx)) {
-    prompt_past1.insert(prompt_past1.end(), prompt.begin() + 1, prompt.end() - prompt_init.size());
+// update prompt_past1
+prompt_past1.clear();
+if (!prompt.empty() && prompt.front() == whisper_token_prev(ctx)) {
+    prompt_past1.insert(prompt_past1.end(), prompt.begin() + 1 + prompt_past0.size(), prompt.end() - prompt_init.size());
```

Though make sure to double-check this works as expected.

Contributor Author


I think that this assumes the entire prompt_past0 is always included in the prompt, but that's not guaranteed. For example, if max_ctx_half - 1 < prompt_past0.size(), we only take a tail of prompt_past0, not all of it.

Member


Yes, good point. Should we truncate prompt_past0 upon initialization so that it does not exceed the max_ctx_half?
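If that route is taken, the clamp could look something like this (a hypothetical helper, not the PR's code; it assumes the carried prompt keeps at most max_ctx_half - 1 tokens, matching the n_take0 formula above):

```cpp
#include <algorithm>
#include <vector>

using token = int;

// Hypothetical: truncate the carried prompt once at initialization so the
// whole of prompt_past0 always fits in the per-window prompt budget, and
// the slicing elsewhere can assume it is included in full.
void truncate_initial_prompt(std::vector<token> & prompt_past0, int max_ctx_half) {
    const int keep = std::min<int>(std::max(max_ctx_half - 1, 0),
                                   (int) prompt_past0.size());
    // keep only the rightmost (newest) tokens of the carried prompt
    prompt_past0.erase(prompt_past0.begin(), prompt_past0.end() - keep);
}
```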

Co-authored-by: Georgi Gerganov <[email protected]>