# whisperfile http server

A simple HTTP server. WAV files are passed to the inference model via HTTP requests; MP3, FLAC, and OGG files are automatically converted to WAV with miniaudio.

https://github.com/ggerganov/whisper.cpp/assets/1991296/e983ee53-8741-4eb5-9048-afe5e4594b8f

## Usage

Download the latest release (e.g. `whisperfile-0.9.0`), a model (e.g. `whisper-tiny.en-q5_1.bin`), and then:
```
./whisperfile-0.9.0 -m whisper-tiny.en-q5_1.bin

usage: ./whisperfile-0.9.0 -m whisper-tiny.en-q5_1.bin [options]

options:
  -h,        --help              [default] show this help message and exit
  -t N,      --threads N         [4      ] number of threads to use during computation
  -p N,      --processors N      [1      ] number of processors to use during computation
  -ot N,     --offset-t N        [0      ] time offset in milliseconds
  -on N,     --offset-n N        [0      ] segment index offset
  -d N,      --duration N        [0      ] duration of audio to process in milliseconds
  -mc N,     --max-context N     [-1     ] maximum number of text context tokens to store
  -ml N,     --max-len N         [0      ] maximum segment length in characters
  -sow,      --split-on-word     [false  ] split on word rather than on token
  -bo N,     --best-of N         [2      ] number of best candidates to keep
  -bs N,     --beam-size N       [-1     ] beam size for beam search
  -wt N,     --word-thold N      [0.01   ] word timestamp probability threshold
  -et N,     --entropy-thold N   [2.40   ] entropy threshold for decoder fail
  -lpt N,    --logprob-thold N   [-1.00  ] log probability threshold for decoder fail
  -debug,    --debug-mode        [false  ] enable debug mode (eg. dump log_mel)
  -tr,       --translate         [false  ] translate from source language to english
  -di,       --diarize           [false  ] stereo audio diarization
  -tdrz,     --tinydiarize       [false  ] enable tinydiarize (requires a tdrz model)
  -nf,       --no-fallback       [false  ] do not use temperature fallback while decoding
  -ps,       --print-special     [false  ] print special tokens
  -pc,       --print-colors      [false  ] print colors
  -pr,       --print-realtime    [false  ] print output in realtime
  -pp,       --print-progress    [false  ] print progress
  -nt,       --no-timestamps     [false  ] do not print timestamps
  -l LANG,   --language LANG     [en     ] spoken language ('auto' for auto-detect)
  -dl,       --detect-language   [false  ] exit after automatically detecting language
             --prompt PROMPT     [       ] initial prompt
  -m FNAME,  --model FNAME       [models/ggml-base.en.bin] model path
  -oved D,   --ov-e-device DNAME [CPU    ] the OpenVINO device used for encode inference
             --host HOST,        [127.0.0.1] Hostname/ip-address for the server
             --port PORT,        [8080   ] Port number for the server
```

> [!WARNING]
> **Do not run the server example with administrative privileges and ensure it's operated in a sandbox environment, especially since it involves risky operations like accepting user file uploads. Always validate and sanitize inputs to guard against potential security threats.**

## request examples

**/inference**
```
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@<file-path>" \
-F temperature="0.0" \
-F temperature_inc="0.2" \
-F response_format="json"
```
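
From a script, the same request can be made without curl. A minimal sketch using only the Python standard library — the `build_multipart` and `transcribe` names are illustrative, and the response is assumed to be JSON with a `text` field (as with `response_format=json`):

```python
import json
import mimetypes
import urllib.request
import uuid

def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode plain form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [f"--{boundary}",
                  f'Content-Disposition: form-data; name="{name}"',
                  "", value]
    ctype = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    lines += [f"--{boundary}",
              f'Content-Disposition: form-data; name="{file_field}"; filename="{file_name}"',
              f"Content-Type: {ctype}", ""]
    # Header block, then raw file bytes, then the closing boundary.
    body = ("\r\n".join(lines).encode() + b"\r\n"
            + file_bytes
            + f"\r\n--{boundary}--\r\n".encode())
    return body, f"multipart/form-data; boundary={boundary}"

def transcribe(path, host="127.0.0.1", port=8080):
    """POST an audio file to /inference and return the transcribed text."""
    with open(path, "rb") as f:
        audio = f.read()
    body, ctype = build_multipart(
        {"temperature": "0.0", "temperature_inc": "0.2", "response_format": "json"},
        "file", path, audio)
    req = urllib.request.Request(f"http://{host}:{port}/inference",
                                 data=body, headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["text"]
```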

**/load**
```
curl 127.0.0.1:8080/load \
-H "Content-Type: multipart/form-data" \
-F model="<path-to-model-file>"
```
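
Since `/load` takes a single form field, the multipart body is small enough to build inline. A sketch with the standard library — `load_model_request` is an illustrative name, and the model path is a placeholder:

```python
import urllib.request
import uuid

def load_model_request(model_path, host="127.0.0.1", port=8080):
    """Build the multipart POST asking the server to switch models."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model_path}\r\n"
        f"--{boundary}--\r\n"
    ).encode()
    return urllib.request.Request(
        f"http://{host}:{port}/load",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )

# With a server running: urllib.request.urlopen(load_model_request("whisper-tiny.en-q5_1.bin"))
```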