What file formats are supported for input? #799

rumpleniltkins · 2023-01-02T17:24:04Z

rumpleniltkins
Jan 2, 2023

Just wondering what file types are supported by the model. Are ogg vorbis files acceptable input?


glangford · 2023-01-03T00:22:31Z

glangford
Jan 3, 2023

ffmpeg is used to load audio (see the link to the code below), so the input type must be supported by ffmpeg. Note that you can also pass a container that holds both audio and video, such as .mp4, as input.

whisper/whisper/audio.py

Line 22 in 28769fc

def load_audio(file: str, sr: int = SAMPLE_RATE):
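To make "ffmpeg is used to load audio" concrete, here is a minimal sketch that asks the ffmpeg command-line tool to decode any input it can read into mono 16-bit PCM and then scales the samples to floats. This is not the body of `load_audio`. The helper names `build_ffmpeg_command`, `pcm16_to_floats`, and `decode_audio`, as well as the 16 kHz default, are assumptions made for illustration, and `decode_audio` needs `ffmpeg` on the PATH.

```python
import array
import subprocess
import sys

SAMPLE_RATE = 16000  # assumed default; the line quoted above only shows the name


def build_ffmpeg_command(path, sr=SAMPLE_RATE):
    """Return an ffmpeg command that writes raw mono 16-bit PCM to stdout."""
    return [
        "ffmpeg",
        "-nostdin",              # never wait for keyboard input
        "-i", path,              # any container/codec this ffmpeg build can read
        "-f", "s16le",           # raw signed 16-bit little-endian samples
        "-acodec", "pcm_s16le",
        "-ac", "1",              # downmix to a single channel
        "-ar", str(sr),          # resample to the requested rate
        "-",                     # send the result to stdout
    ]


def pcm16_to_floats(raw):
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()       # the bytes are little-endian by construction
    return [s / 32768.0 for s in samples]


def decode_audio(path, sr=SAMPLE_RATE):
    """Decode `path` with ffmpeg; raises CalledProcessError if ffmpeg fails."""
    result = subprocess.run(
        build_ffmpeg_command(path, sr), capture_output=True, check=True
    )
    return pcm16_to_floats(result.stdout)
```

Because the format handling is delegated entirely to ffmpeg, a call such as `decode_audio("talk.ogg")` or `decode_audio("clip.mp4")` succeeds or fails depending only on what the installed ffmpeg build can read.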

1 reply

glangford Jan 3, 2023

See ffmpeg -formats; on my platform there are 395 entries, I don't know if they all work :)
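If you want to check a specific format programmatically rather than read the listing by eye, a rough sketch follows. It assumes the usual layout of the `ffmpeg -formats` listing (a legend, a line of dashes, then one entry per line starting with `D`/`E` flag columns). The function names `parse_demuxers` and `available_demuxers` are hypothetical, and the layout may differ between ffmpeg builds.

```python
import re
import subprocess

# Entry lines look like " DE ogg             Ogg": a leading space, a demux
# flag column, a mux flag column, an optional third flag column printed by
# some builds, then the comma-separated format names.
_ENTRY = re.compile(r"^ ([D ])([E ])[d ]?\s+(\S+)")


def parse_demuxers(listing):
    """Return the set of format names marked as demuxable (readable)."""
    names = set()
    in_table = False
    for line in listing.splitlines():
        if not in_table:
            # The legend ends with a line made up only of dashes.
            stripped = line.strip()
            in_table = stripped != "" and set(stripped) == {"-"}
            continue
        match = _ENTRY.match(line)
        if match and match.group(1) == "D":
            names.update(match.group(3).split(","))
    return names


def available_demuxers():
    """Ask the local ffmpeg binary which formats it can read."""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-formats"],
        capture_output=True, text=True, check=True,
    )
    return parse_demuxers(result.stdout)
```

For example, `"ogg" in available_demuxers()` reports whether the local build can read Ogg input. The names are ffmpeg format names, not file extensions, and a listed format can still fail if the codec inside the file is not supported.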

gwpl · 2023-05-01T20:08:33Z

gwpl
May 1, 2023

Could the Whisper API add support for OGG/OGA and FLAC?
(Sorry if this is not the right place to file such a request.)

According to https://platform.openai.com/docs/api-reference/audio/create#audio/create-file , the official API does not currently support two very nice open-source formats:

ogg/oga https://en.wikipedia.org/wiki/Vorbis
FLAC https://en.wikipedia.org/wiki/FLAC

I have tons of recordings in these formats, and they are the default in a lot of software and on many devices.

As these example snippets show, ffmpeg can already read OGG/OGA and FLAC:

for i in *.ogg; do
  ffmpeg -acodec libvorbis -i "$i" -acodec pcm_s16le "${i%ogg}wav"
done

(from https://stackoverflow.com/a/62267248/)

ffmpeg -i inputfile.flac output.wav

from https://stackoverflow.com/a/23380032/544721
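The two shell snippets above can be mirrored in Python when the conversion needs to be part of a larger script. This is only a sketch: `wav_command`, `plan_conversions`, and `convert_all` are hypothetical helper names, the extension list is an assumption, and `convert_all` requires `ffmpeg` on the PATH.

```python
import subprocess
from pathlib import Path


def wav_command(src):
    """Build the ffmpeg call that converts one file to 16-bit PCM WAV."""
    dst = str(Path(src).with_suffix(".wav"))
    return ["ffmpeg", "-i", src, "-acodec", "pcm_s16le", dst]


def plan_conversions(paths, extensions=(".ogg", ".oga", ".flac")):
    """Return one command per path whose extension is in `extensions`."""
    return [
        wav_command(str(p))
        for p in paths
        if Path(p).suffix.lower() in extensions
    ]


def convert_all(folder):
    """Convert every matching file in `folder`, stopping on the first error."""
    for cmd in plan_conversions(sorted(Path(folder).iterdir())):
        subprocess.run(cmd, check=True)
```

Keeping the command construction separate from the `subprocess.run` call makes it easy to print or inspect the planned commands before anything is written to disk.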

Can we please add support in the Whisper API for the OGG/OGA and FLAC formats?
Looking at the codebase,

whisper/whisper/audio.py

Line 46 in c09a7ae

ffmpeg.input(file, threads=0)

it looks like this is not a problem at the whisper Python level but rather on the Whisper API side.

Either way, I would be grateful if these formats were confirmed to work well with github.com/openai/whisper for local deployments.

8 replies

Pachocastillosr Jun 29, 2023

+1 for OGG

jongwook Jul 12, 2023
Maintainer

FLAC and OGG are now supported in the Whisper API!

gwpl Jul 13, 2023

Thank you! That's great news!

Shouldn't the documentation be updated to celebrate this? https://platform.openai.com/docs/api-reference/audio/create#audio/create-file

shyamal-anadkat Jul 28, 2023

We should get the docs updated!

rattrayalex Jul 29, 2023

Proposed adding this to the OpenAPI spec here: openai/openai-openapi#64

Hobbymaker-de · 2024-01-19T12:27:58Z
