whisper && ffmpeg translate error conversion failed! #2628
Replies: 3 comments
-
I tried reinstalling Python, ffmpeg, and even whisper. After reinstalling, I ran the same command again and still got the error!
-
But I found something very strange. Files that I had already translated still work: when I run the whisper command on them again, it completes normally. New files, however, keep failing. Could it be due to caching?
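One way to test the observation above (old files work, new files fail) is to ask ffmpeg to decode each file's audio and report only errors, without involving whisper at all. The following is a minimal sketch, not something posted in this thread: the helper names `build_decode_check_cmd` and `decode_errors` are hypothetical, and it assumes `ffmpeg` is on `PATH`. If only the new files print errors, the cause is more likely in those files than in a cache.

```python
import subprocess
import sys


def build_decode_check_cmd(path):
    """Build an ffmpeg command that decodes the audio of `path` and discards it."""
    # -v error   : print only error messages
    # -vn        : skip the video stream, since whisper only needs audio
    # -f null -  : run the full decode but write no output file
    return ["ffmpeg", "-nostdin", "-v", "error", "-i", str(path),
            "-vn", "-f", "null", "-"]


def decode_errors(path):
    """Run the check and return ffmpeg's stderr text (empty if nothing was reported)."""
    proc = subprocess.run(
        build_decode_check_cmd(path),
        capture_output=True,
        text=True,
        errors="replace",  # tolerate non-UTF-8 bytes in ffmpeg's messages
    )
    return proc.stderr.strip()


if __name__ == "__main__":
    # Hypothetical usage: python check_audio.py old_file.mp4 new_file.mp4
    for name in sys.argv[1:]:
        errors = decode_errors(name)
        print(f"{name}: {'OK' if not errors else 'DECODE ERRORS'}")
        if errors:
            print(errors)
```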
-
#!/bin/bash
echo "🔍 Checking ffmpeg version..."
FFMPEG_VERSION=$(ffmpeg -version | head -n 1 | awk '{print $3}')
NEEDS_DOWNGRADE=$(echo "$FFMPEG_VERSION" | awk -F. '{ if ($1 >= 7) print "yes"; else print "no"; }')
if [ "$NEEDS_DOWNGRADE" = "yes" ]; then
    echo "
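The version check at the start of the script above can also be sketched in Python, which may be convenient because whisper itself runs in Python. This is an illustrative sketch only: the function names are hypothetical, and the threshold of 7 is taken from the script, not from any confirmation in this thread that ffmpeg 7 is the cause.

```python
import re
import subprocess


def parse_ffmpeg_major(first_line):
    """Return the major version from the first line of `ffmpeg -version`, or None."""
    # Release builds print e.g. "ffmpeg version 7.1.1 Copyright ...".
    # Some builds prefix the number with "n"; git snapshot builds have no
    # plain version number and yield None.
    match = re.search(r"ffmpeg version n?(\d+)", first_line)
    return int(match.group(1)) if match else None


def installed_ffmpeg_major():
    """Query the ffmpeg on PATH; raises FileNotFoundError if it is not installed."""
    out = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True
    ).stdout
    return parse_ffmpeg_major(out.splitlines()[0]) if out else None


def needs_downgrade(major, threshold=7):
    """Mirror the script's rule: flag versions at or above the threshold."""
    return major is not None and major >= threshold
```

Calling `needs_downgrade(installed_ffmpeg_major())` reproduces the script's yes/no decision as a boolean.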
-
I had been using whisper without problems, but yesterday it suddenly stopped working. I read through the error message but still couldn't identify the specific cause, and I don't know whether it is a version compatibility issue. My Python version is 3.9.9, my ffmpeg version is 7.1.1, and my machine has an Apple M2 chip. The full error message is below. Can anyone help locate the problem?
Traceback (most recent call last):
File "/Users/sam.jia/.pyenv/versions/3.9.9/lib/python3.9/site-packages/whisper/audio.py", line 58, in load_audio
out = run(cmd, capture_output=True, check=True).stdout
File "/Users/sam.jia/.pyenv/versions/3.9.9/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', 'test.mp4', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 234.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/sam.jia/.pyenv/versions/3.9.9/lib/python3.9/site-packages/whisper/transcribe.py", line 615, in cli
result = transcribe(model, audio_path, temperature=temperature, **args)
File "/Users/sam.jia/.pyenv/versions/3.9.9/lib/python3.9/site-packages/whisper/transcribe.py", line 139, in transcribe
mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
File "/Users/sam.jia/.pyenv/versions/3.9.9/lib/python3.9/site-packages/whisper/audio.py", line 140, in log_mel_spectrogram
audio = load_audio(audio)
File "/Users/sam.jia/.pyenv/versions/3.9.9/lib/python3.9/site-packages/whisper/audio.py", line 60, in load_audio
raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
built with Apple clang version 15.0.0 (clang-1500.1.0.2.5)
configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/7.1.1_3 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags='-Wl,-ld_classic' --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libharfbuzz --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox --enable-neon
libavutil 59. 39.100 / 59. 39.100
libavcodec 61. 19.101 / 61. 19.101
libavformat 61. 7.100 / 61. 7.100
libavdevice 61. 3.100 / 61. 3.100
libavfilter 10. 4.100 / 10. 4.100
libswscale 8. 3.100 / 8. 3.100
libswresample 5. 3.100 / 5. 3.100
libpostproc 58. 3.100 / 58. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test.mp4':
Metadata:
major_brand : mp42
minor_version : 1
compatible_brands: isommp41mp42
creation_time : 2025-07-27T01:32:30.000000Z
Duration: 00:35:36.04, start: 0.000000, bitrate: 6124 kb/s
Stream #0:0[0x1]: Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 5990 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
Metadata:
creation_time : 2025-07-27T01:32:30.000000Z
handler_name : Core Media Video
vendor_id : [0][0][0][0]
Stream #0:1[0x2]: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
creation_time : 2025-07-27T01:32:30.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
Stream mapping:
Stream #0:1 -> #0:0 (aac (native) -> pcm_s16le (native))
Output #0, s16le, to 'pipe:':
Metadata:
major_brand : mp42
minor_version : 1
compatible_brands: isommp41mp42
encoder : Lavf61.7.100
Stream #0:0(und): Audio: pcm_s16le, 16000 Hz, mono, s16, 256 kb/s (default)
Metadata:
creation_time : 2025-07-27T01:32:30.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
encoder : Lavc61.19.101 pcm_s16le
[s16le @ 0x156e0f0e0] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 5534 >= 5534
[aac @ 0x156e10f80] Sample rate index in program config element does not match the sample rate index configured by the container.
[aist#0:1/aac @ 0x156e05f60] [dec:aac @ 0x156e109a0] Error submitting packet to decoder: Invalid data found when processing input
[aac @ 0x156e10f80] channel element 2.1 is not allocated
[aist#0:1/aac @ 0x156e05f60] [dec:aac @ 0x156e109a0] Error submitting packet to decoder: Invalid data found when processing input
[aac @ 0x156e10f80] Prediction is not allowed in AAC-LC.
[aist#0:1/aac @ 0x156e05f60] [dec:aac @ 0x156e109a0] Error submitting packet to decoder: Invalid data found when processing input
decoder: Invalid data found when processing input
[out#0/s16le @ 0x600003768240] video:0KiB audio:47849KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.000000%
size= 47849KiB time=00:25:49.70 bitrate= 252.9kbits/s speed=1.4e+03x
Conversion failed!
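Most of the message above is ffmpeg's banner and stream description. The lines that carry diagnostic information are the bracketed `[component @ 0x...]` entries near the end. In this log they come from the `aac` decoder, and output stopped at 00:25:49.70 of a 00:35:36.04 file, which suggests (but does not prove) damaged audio data at that point in `test.mp4`. The sketch below pulls those entries out of a captured stderr string. The function name `component_messages` is hypothetical, and the pattern is an assumption based only on the log lines shown here.

```python
import re

# Matches ffmpeg log lines of the form "[component @ 0xADDRESS] message".
LOG_LINE = re.compile(
    r"^\[(?P<component>[^\]@]+?)\s*@\s*0x[0-9a-fA-F]+\]\s*(?P<message>.*)$"
)


def component_messages(stderr_text):
    """Return (component, message) pairs for ffmpeg's bracketed log lines."""
    pairs = []
    for line in stderr_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            pairs.append((match.group("component"), match.group("message")))
    return pairs
```

Note that the final statistics line (the `out#0/s16le` component in this log) has the same shape and is returned as well. The input can be the text after "Failed to load audio:" in whisper's `RuntimeError`.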