sachaarbonel
Contributor

Summary

Adds a CLI flag --warmup-file to the whisper.cpp server that allows running a transcription on a specified audio file during startup to warm up the model before serving requests.

Changes

  • Added warmup_file parameter to whisper_params struct
  • Added -wf/--warmup-file CLI flag to specify path to warmup audio file
  • Implemented warmup logic that runs a quick transcription during model initialization
  • Added help text documentation for the new flag

Implementation Details

  • Warmup runs after model initialization but before server starts accepting requests
  • Uses optimized parameters for fast warmup (single segment, no context, minimal audio context)
  • Provides timing feedback showing how long the warmup took
  • Gracefully handles warmup failures with warning messages
  • Only executes warmup when --warmup-file flag is provided

Usage

./whisper-server --warmup-file samples/jfk.wav --model models/ggml-base.en.bin

Benefits

  • Reduces latency for the first transcription request
  • Ensures the model is fully loaded and has completed one inference pass before the server accepts requests
  • Useful for production deployments where consistent response times are important

This change is backward compatible and doesn't affect existing functionality when the warmup flag is not used.

@ggerganov
Member

This warmup logic can be delegated to the script that starts the server process. After it starts the server, simply send a transcription request with a short audio file to do the warmup. This is a better alternative because we won't have to maintain this logic.
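
A minimal sketch of such a launcher script, in Python, might look like the following. It is an illustration under stated assumptions rather than anything shipped with the project: it assumes the server listens on 127.0.0.1:8080, exposes a POST /inference endpoint, and reads the audio from a multipart form field named file. The helper names (build_multipart, warm_up, start_and_warm_up) are hypothetical.

```python
import subprocess
import time
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode a single file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def warm_up(url: str, audio: Path, attempts: int = 30, delay: float = 1.0) -> float:
    """POST the audio file until a request succeeds; return that request's duration in seconds."""
    # Assumption: the server expects the audio in a form field named "file".
    body, content_type = build_multipart("file", audio.name, audio.read_bytes())
    for _ in range(attempts):
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": content_type}
        )
        started = time.monotonic()
        try:
            with urllib.request.urlopen(request, timeout=120) as response:
                response.read()  # discard the transcription; only the side effect matters
            return time.monotonic() - started
        except OSError:
            # Covers connection refused, timeouts, and HTTP errors: wait and retry.
            time.sleep(delay)
    raise RuntimeError(f"warmup failed after {attempts} attempts")


def start_and_warm_up(command: list[str], url: str, audio: Path) -> subprocess.Popen:
    """Start the server process, send one warmup request, and return the process handle."""
    server = subprocess.Popen(command)
    try:
        print(f"warmup took {warm_up(url, audio):.2f}s")
    except (RuntimeError, OSError) as err:
        # Warmup is best effort: report the failure and leave the server running.
        print(f"warning: warmup skipped: {err}")
    return server
```

With the model and sample file from the Usage section, a call could look like start_and_warm_up(["./whisper-server", "--model", "models/ggml-base.en.bin"], "http://127.0.0.1:8080/inference", Path("samples/jfk.wav")). Note that with this approach the server is already reachable by other clients while the warmup request is in flight, so the launcher (or a load balancer health check) has to hold back traffic until it returns.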
