@@ -40,19 +40,23 @@ The `--no-prints` is optional. It's helpful in avoiding a lot of verbose
 logging and statistical information from being printed, which is useful
 when writing shell scripts.
 
-## Converting MP3 to WAV
+## Supported Audio Formats
 
-Whisperfile only currently understands .wav files. So if you have files
-in a different audio format, you need to convert them to wav beforehand.
-One great tool for doing that is sox (your swiss army knife for audio).
-It's easily installed and used on Debian systems as follows:
+Whisperfile prefers that the input file be a 16khz .wav file with 16-bit
+signed linear samples that's stereo or mono. Otherwise it'll attempt to
+convert your audiofile automatically using an internal library. The MP3,
+FLAC, and Ogg Vorbis Theora formats are supported across platforms.
+
+For example, here's an audio recording of a famous poem in MP3 format:
 
 ```
-sudo apt install sox libsox-fmt-all
 wget https://archive.org/download/raven/raven_poe_64kb.mp3
-sox raven_poe_64kb.mp3 -r 16k raven_poe_64kb.wav
+o//whisper.cpp/main -m whisper-tiny.en-q5_1.bin -f raven_poe_64kb.mp3 -pc
 ```
 
+Here we also passed the `-pc` flag to get color-coded terminal output
+which communicates the confidence of transcription.
+
 ## Higher Quality Models
 
 The tiny model may get some words wrong. For example, it might think
@@ -61,14 +65,14 @@ enables whisperfile to decode The Raven perfectly. However it's slower.
 
 ```
 wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.en.bin
-o//whisper.cpp/main -m ggml-medium.en.bin -f raven_poe_64kb.wav --no-prints
+o//whisper.cpp/main -m ggml-medium.en.bin -f raven_poe_64kb.mp3 --no-prints
 ```
 
 Lastly, there's the large model, which is the best, but also slowest.
 
 ```
 wget -O whisper-large-v3.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin
-o//whisper.cpp/main -m whisper-large-v3.bin -f raven_poe_64kb.wav --no-prints
+o//whisper.cpp/main -m whisper-large-v3.bin -f raven_poe_64kb.mp3 --no-prints
 ```
 
 ## Installation