feat(docker): bundle whisper.cpp + tiny model for out-of-the-box transcription #9
Merged
…scription

Ship the whisper.cpp CLI, the ggml `tiny` model, and ffmpeg in the image so the `transcribe` tool and Telegram voice auto-transcription work with zero setup: no host install, no first-run model download.

Dockerfile:
- New `whisper` build stage compiles whisper-cli (OpenMP off, so the only runtime shared lib is libstdc++) and fetches ggml-tiny.bin. The model is overridable via `--build-arg WHISPER_MODEL=base|small|medium`.
- Runtime stage installs ffmpeg + libstdc++ and copies whisper-cli onto PATH and the model to /usr/local/share/whisper/models/.

The model is baked OUTSIDE ~/.odek on purpose: the Telegram compose profiles bind-mount ./.odek over /home/odek/.odek, which would shadow a model placed under the default ~/.odek/whisper/models. A fixed image path keeps it visible across all four profiles.

Config:
- config.restricted.json and config.godmode.json now set a transcription block (model: tiny, auto_transcribe: true, models_dir pointing at the baked path).

Docs: document bundled transcription in docker/README.md and docs/TELEGRAM.md.

Verified by building the image and running OGG->WAV->whisper inference with the bundled model as the non-root odek user.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
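The shadowing argument in the commit message above can be illustrated with a small sketch. It is not odek's actual lookup code: the `model_path` and `is_shadowed` helpers are hypothetical, and the `ggml-<name>.bin` file naming is assumed from the `ggml-tiny.bin` file this commit mentions. Only the paths and the `model`, `auto_transcribe`, and `models_dir` keys come from the commit itself.

```python
from pathlib import Path

# Default location named in the commit message (~/.odek/whisper/models for user odek).
DEFAULT_MODELS_DIR = Path("/home/odek/.odek/whisper/models")


def model_path(transcription: dict) -> Path:
    """Return the ggml model file a transcription block points at.

    Assumes a `ggml-<name>.bin` naming scheme; the real lookup may differ.
    """
    models_dir = Path(transcription.get("models_dir", DEFAULT_MODELS_DIR))
    return models_dir / f"ggml-{transcription.get('model', 'tiny')}.bin"


def is_shadowed(path: Path, mount_target: Path) -> bool:
    """True if `path` sits under a bind-mount target, so the mount would hide it."""
    return path == mount_target or mount_target in path.parents


# Transcription block as described for config.restricted.json / config.godmode.json.
bundled = {
    "model": "tiny",
    "auto_transcribe": True,
    "models_dir": "/usr/local/share/whisper/models",
}
mount = Path("/home/odek/.odek")  # target of the ./.odek bind mount

for block in (bundled, {"model": "tiny"}):
    path = model_path(block)
    print(path, "shadowed" if is_shadowed(path, mount) else "visible")
```

With the baked `models_dir` the model stays visible; with the default location it falls under the mount target and would be hidden by whatever the host's `./.odek` contains.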
Clone a tagged release via a WHISPER_VERSION build arg instead of tracking master, so the bundled whisper-cli is reproducible and can't drift/break in CI on an upstream change.

Verified with a no-cache build (git describe → v1.8.6, whisper-cli + ggml-tiny.bin produced).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pre-merge verification pass ✅
Re-verified end to end before merge:
Build (reproducible)
Runtime (as non-root
Config delivery (compose)
Unit tests
Summary
Makes voice transcription work out of the box in the Docker image. The `transcribe` tool and Telegram voice auto-transcription previously required the host to have whisper.cpp + a model installed (or a first-run download). Now the image ships everything: the whisper.cpp CLI, the ggml `tiny` model, and `ffmpeg`.

Changes

- `docker/Dockerfile`
  - New `whisper` build stage compiles `whisper-cli` from source (same alpine base as runtime for musl ABI match; `GGML_OPENMP=OFF` so the only runtime shared lib needed is `libstdc++`) and fetches `ggml-tiny.bin`.
  - The model is overridable via `--build-arg WHISPER_MODEL=base|small|medium`.
  - Runtime stage installs `ffmpeg` (OGG/Opus → WAV for whisper) + `libstdc++`, and copies `whisper-cli` onto `PATH` and the model to `/usr/local/share/whisper/models/`.
- `docker/config.restricted.json` + `docker/config.godmode.json`
  - Set a `transcription` block: `model: tiny`, `auto_transcribe: true`, `models_dir: /usr/local/share/whisper/models`.
- Docs: `docker/README.md` and `docs/TELEGRAM.md` document the bundled setup.

Why the model lives outside `~/.odek`

The default model location is `~/.odek/whisper/models`, but the Telegram compose profiles bind-mount `./.odek` over `/home/odek/.odek`, which would shadow a model baked under `~/.odek`, breaking exactly the profiles (voice notes) that need it. So the model is baked to a fixed image path (`/usr/local/share/whisper/models`) and the configs point `transcription.models_dir` there, keeping it visible across all four profiles.

Verification

Built the image locally and confirmed, running as the non-root `odek` user:

- `whisper-cli` is on `PATH` and executes (`ldd` → only `libstdc++`/`libgcc_s` + musl)
- `tiny` model is readable at the configured path
- `ffmpeg` is present
- OGG → `ffmpeg` to WAV → `whisper-cli --output-json` inference with the bundled model, JSON written ✅

Image size impact: ~78 MB model + ~2.7 MB binary + ffmpeg/libstdc++.
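For readers who want to repeat the end-to-end check, the two-step pipeline might look like the sketch below. This is an illustration under assumptions, not the commands actually run for this PR: the `build_commands` and `transcribe` helpers, the input file name, and the work directory are hypothetical. The `ffmpeg` options request the 16 kHz mono 16-bit PCM WAV that whisper.cpp conventionally takes as input, and `whisper-cli` is given its `-m`, `-f`, `--output-json`, and `--output-file` options.

```python
import shutil
import subprocess
from pathlib import Path

# Baked model path from this PR.
MODEL = Path("/usr/local/share/whisper/models/ggml-tiny.bin")


def build_commands(voice_note: Path, workdir: Path, model: Path = MODEL):
    """Return the (ffmpeg, whisper-cli) argument lists for one voice note."""
    wav = workdir / (voice_note.stem + ".wav")
    out_base = workdir / voice_note.stem  # whisper-cli appends ".json" to this base
    ffmpeg_cmd = ["ffmpeg", "-y", "-i", str(voice_note),
                  "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav)]
    whisper_cmd = ["whisper-cli", "-m", str(model), "-f", str(wav),
                   "--output-json", "--output-file", str(out_base)]
    return ffmpeg_cmd, whisper_cmd


def transcribe(voice_note: Path, workdir: Path) -> Path:
    """Run both steps in order and return the JSON path; raises if a tool fails."""
    for cmd in build_commands(voice_note, workdir):
        subprocess.run(cmd, check=True)
    return workdir / (voice_note.stem + ".json")


if __name__ == "__main__":
    # Cheap presence check, mirroring the "is on PATH" items in the list above.
    missing = [tool for tool in ("ffmpeg", "whisper-cli") if shutil.which(tool) is None]
    print("missing tools:", missing or "none")
```

Inside the container, calling `transcribe(Path("note.ogg"), Path("/tmp"))` as the `odek` user would exercise the same OGG → WAV → JSON path; keeping command construction separate from execution makes the arguments easy to inspect without either tool installed.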
🤖 Generated with Claude Code