runtime error with docker "quickest start" method #426

@compactify

I ran docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:latest on both my Linux and Windows hosts, and got the same error each time while trying to generate a simple speech string.

Screenshot attached; console log follows:

   ~  sudo docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:latest  126 ✘
[sudo] password for marc:
Unable to find image 'ghcr.io/remsky/kokoro-fastapi-cpu:latest' locally
latest: Pulling from remsky/kokoro-fastapi-cpu
dad67da3f26b: Pull complete
4b03b4f4fa5c: Pull complete
50a956a18493: Pull complete
c14326ed6c85: Pull complete
f736c347b6d1: Pull complete
acca9d062841: Pull complete
57f685d63b45: Pull complete
4f4fb700ef54: Pull complete
7e2a369719f4: Pull complete
00b35d20c8b3: Pull complete
79af734a0a7a: Pull complete
2b0293d516ae: Pull complete
b1e321f601d5: Pull complete
9b6e073b1eb4: Pull complete
c70ad950ce72: Pull complete
df9f44c9396f: Pull complete
Digest: sha256:c8812546d358cbfd6a5c4087a28795b2b001d8e32d7a322eedd246e6bc13cb55
Status: Downloaded newer image for ghcr.io/remsky/kokoro-fastapi-cpu:latest
2025-12-07 07:34:11.764 | INFO | main:download_model:60 - Model files already exist and are valid
INFO: Started server process [11]
INFO: Waiting for application startup.
07:34:22 AM | INFO | main:57 | Loading TTS model and voice packs...
07:34:22 AM | INFO | model_manager:38 | Initializing Kokoro V1 on cpu
07:34:22 AM | DEBUG | paths:101 | Searching for model in path: /app/api/src/models
07:34:22 AM | INFO | kokoro_v1:46 | Loading Kokoro model on cpu
07:34:22 AM | INFO | kokoro_v1:47 | Config path: /app/api/src/models/v1_0/config.json
07:34:22 AM | INFO | kokoro_v1:48 | Model path: /app/api/src/models/v1_0/kokoro-v1_0.pth
WARNING: Defaulting repo_id to hexgrad/Kokoro-82M. Pass repo_id='hexgrad/Kokoro-82M' to suppress this warning.
/app/.venv/lib/python3.10/site-packages/torch/nn/modules/rnn.py:123: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
warnings.warn(
/app/.venv/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:143: FutureWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
WeightNorm.apply(module, name, dim)
07:34:23 AM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
07:34:23 AM | DEBUG | paths:131 | Searching for voice in path: /app/api/src/voices/v1_0
07:34:23 AM | DEBUG | model_manager:77 | Using default voice 'af_heart' for warmup
07:34:23 AM | INFO | kokoro_v1:81 | Creating new pipeline for language code: a
WARNING: Defaulting repo_id to hexgrad/Kokoro-82M. Pass repo_id='hexgrad/Kokoro-82M' to suppress this warning.
07:34:23 AM | DEBUG | kokoro_v1:261 | Generating audio for text with lang_code 'a': 'Warmup text for initialization.'
07:34:24 AM | DEBUG | kokoro_v1:268 | Got audio chunk with shape: torch.Size([57600])
07:34:24 AM | INFO | model_manager:84 | Warmup completed in 1958ms
07:34:24 AM | INFO | main:106 |

░░░░░░░░░░░░░░░░░░░░░░░░

╔═╗┌─┐┌─┐┌┬┐
╠╣ ├─┤└─┐ │ 
╚  ┴ ┴└─┘ ┴
╦╔═┌─┐┬┌─┌─┐
╠╩╗│ │├┴┐│ │
╩ ╩└─┘┴ ┴└─┘

░░░░░░░░░░░░░░░░░░░░░░░░

Model warmed up on cpu: kokoro_v1
Running on CPU
67 voice packs loaded

Beta Web Player: http://0.0.0.0:8880/web/
or http://localhost:8880/web/
░░░░░░░░░░░░░░░░░░░░░░░░

INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8880 (Press CTRL+C to quit)
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56508 - "GET /web/ HTTP/1.1" 200 OK
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56508 - "GET /web/styles/base.css HTTP/1.1" 200 OK
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56512 - "GET /web/styles/layout.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:56522 - "GET /web/styles/header.css HTTP/1.1" 200 OK
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56552 - "GET /web/styles/player.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:56536 - "GET /web/styles/forms.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:56566 - "GET /web/styles/responsive.css HTTP/1.1" 200 OK
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56508 - "GET /web/styles/badges.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:56512 - "GET /web/styles/controls.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:56522 - "GET /web/siriwave.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:56552 - "GET /web/src/App.js HTTP/1.1" 200 OK
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56512 - "GET /web/src/services/AudioService.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:56552 - "GET /web/src/services/VoiceService.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:56508 - "GET /web/src/components/WaveVisualizer.js HTTP/1.1" 200 OK
07:35:42 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56536 - "GET /web/src/state/PlayerState.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:56522 - "GET /web/src/components/PlayerControls.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:56566 - "GET /web/src/components/VoiceSelector.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:56512 - "GET /web/src/components/TextEditor.js HTTP/1.1" 200 OK
07:35:43 AM | INFO | openai_compatible:70 | Created global TTSService instance
07:35:43 AM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
INFO: 172.17.0.1:56512 - "GET /v1/audio/voices HTTP/1.1" 200 OK
07:35:43 AM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:56512 - "GET /web/favicon.svg HTTP/1.1" 200 OK
07:35:58 AM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
07:35:58 AM | DEBUG | streaming_audio_writer:40 | Disabling Xing VBR header for MP3 encoding.
INFO: 172.17.0.1:44624 - "POST /v1/audio/speech HTTP/1.1" 200 OK
07:35:58 AM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
07:35:58 AM | DEBUG | paths:131 | Searching for voice in path: /app/api/src/voices/v1_0
07:35:58 AM | DEBUG | tts_service:204 | Using single voice path: /app/api/src/voices/v1_0/af_alloy.pt
07:35:58 AM | DEBUG | tts_service:280 | Using voice path: /app/api/src/voices/v1_0/af_alloy.pt
07:35:58 AM | INFO | tts_service:284 | Using lang_code 'a' for voice 'af_alloy' in audio stream
07:35:58 AM | INFO | text_processor:159 | Starting smart split for 15 chars
07:35:58 AM | DEBUG | text_processor:164 | Split raw text into 1 parts by pause tags.
07:35:58 AM | DEBUG | text_processor:65 | Total processing took 27.52ms for chunk: 'hello my friend'
07:35:58 AM | INFO | text_processor:308 | Yielding final chunk 1 for part: 'hello my friend' (17 tokens)
07:35:58 AM | DEBUG | kokoro_v1:261 | Generating audio for text with lang_code 'a': 'hello my friend'
07:35:59 AM | DEBUG | kokoro_v1:268 | Got audio chunk with shape: torch.Size([51000])
07:35:59 AM | INFO | text_processor:332 | Split completed in 733.36ms, produced 1 chunks (including pauses)
07:35:59 AM | DEBUG | streaming_audio_writer:85 | Muxed final packets.
07:36:11 AM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
07:36:11 AM | DEBUG | streaming_audio_writer:40 | Disabling Xing VBR header for MP3 encoding.
INFO: 172.17.0.1:57318 - "POST /v1/audio/speech HTTP/1.1" 200 OK
07:36:11 AM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
07:36:11 AM | DEBUG | paths:131 | Searching for voice in path: /app/api/src/voices/v1_0
07:36:11 AM | DEBUG | tts_service:204 | Using single voice path: /app/api/src/voices/v1_0/af_alloy.pt
07:36:11 AM | DEBUG | tts_service:280 | Using voice path: /app/api/src/voices/v1_0/af_alloy.pt
07:36:11 AM | INFO | tts_service:284 | Using lang_code 'a' for voice 'af_alloy' in audio stream
07:36:11 AM | INFO | text_processor:159 | Starting smart split for 15 chars
07:36:11 AM | DEBUG | text_processor:164 | Split raw text into 1 parts by pause tags.
07:36:11 AM | DEBUG | text_processor:65 | Total processing took 0.23ms for chunk: 'hello my friend'
07:36:11 AM | INFO | text_processor:308 | Yielding final chunk 1 for part: 'hello my friend' (17 tokens)
07:36:11 AM | DEBUG | kokoro_v1:261 | Generating audio for text with lang_code 'a': 'hello my friend'
07:36:11 AM | DEBUG | kokoro_v1:268 | Got audio chunk with shape: torch.Size([51000])
07:36:11 AM | INFO | text_processor:332 | Split completed in 563.03ms, produced 1 chunks (including pauses)
07:36:11 AM | DEBUG | streaming_audio_writer:85 | Muxed final packets.
^CINFO: Shutting down
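
For anyone who wants to reproduce this without the web player, the POST /v1/audio/speech call from the log can be sent directly. This is only a rough sketch: the JSON field names (model, input, voice, response_format) and the model name "kokoro" are assumptions based on the endpoint looking OpenAI-compatible, and the function names are made up for illustration. The voice and text are the ones that appear in the log.

```python
import json
import urllib.request

# Port published by `docker run -p 8880:8880`.
BASE_URL = "http://localhost:8880"


def build_speech_request(text, voice="af_alloy", base_url=BASE_URL):
    """Build (but do not send) a POST request to /v1/audio/speech."""
    payload = {
        "model": "kokoro",         # assumption: model name the server accepts
        "input": text,             # assumption: OpenAI-style field name
        "voice": voice,            # voice seen in the log
        "response_format": "mp3",  # the log shows MP3 encoding
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_speech(text, out_path="hello.mp3"):
    """Send the request, save the response body, return the byte count."""
    request = build_speech_request(text)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

With the container running, fetch_speech("hello my friend") should write hello.mp3 if the server side works. If that file plays, the problem is more likely in the browser or web player than in the API, which would fit the 200 OK responses in the log.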
