How to use whisper without load_audio function (with audio array which loaded by torchaudio) #1768

amitli1 · 2023-11-07T12:06:31Z

In my app, I receive an array of audio samples (sample rate = 8000) that was loaded with torchaudio.load.
For efficiency, I want to avoid loading the WAV file again and instead resample the existing array to 16000.

whisper.load_audio uses ffmpeg to load the audio and resample it to 16000.
I have tried resampling the audio array with librosa and with torchaudio, but the results never seem to match the ffmpeg output.

(I assume that if I use a resampling method different from the one the Whisper model was trained with, I may get worse results.)

Example:
Loading the test.wav file (SR = 8000) and printing the first 5 samples:
whisper_audio = whisper.load_audio(file) => [-0.00082397 -0.00115967 -0.00186157 -0.00231934 -0.00222778, ...]

Loading with torchaudio and resampling with librosa:
librosa.resample(vad_audio, orig_sr=8000, target_sr=16000, scale=True, res_type='kaiser_best')
=> [-0.00082317 -0.0010577 -0.0013937 -0.0016688 -0.00186235, ...]

The values are different.

How can I resample the audio in exactly the way ffmpeg does it?
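
Comparing only the first few samples by eye makes it hard to judge how far apart two resamplers really are. The sketch below is one possible way to quantify the gap over the whole signal. The helper name compare_waveforms is hypothetical (not part of whisper, librosa, or torchaudio), and it assumes both arrays are mono, at the same sample rate, and roughly time-aligned; a resampler that introduces a delay would need alignment first.

```python
import numpy as np


def compare_waveforms(a, b):
    """Summarize how much two 1-D waveforms differ (hypothetical helper)."""
    # Resamplers may return slightly different lengths, so compare the overlap.
    n = min(len(a), len(b))
    a = np.asarray(a[:n], dtype=np.float64)
    b = np.asarray(b[:n], dtype=np.float64)
    diff = a - b
    return {
        "max_abs_diff": float(np.max(np.abs(diff))),
        "rms_diff": float(np.sqrt(np.mean(diff ** 2))),
        # Pearson correlation; values close to 1.0 mean nearly the same shape.
        "correlation": float(np.corrcoef(a, b)[0, 1]),
    }


# The five samples quoted in the post, used here only as a tiny demonstration.
whisper_head = [-0.00082397, -0.00115967, -0.00186157, -0.00231934, -0.00222778]
librosa_head = [-0.00082317, -0.0010577, -0.0013937, -0.0016688, -0.00186235]
print(compare_waveforms(whisper_head, librosa_head))
```

In practice the full arrays would be passed in rather than five samples, since a short prefix says little about the overall difference.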

glangford · 2023-11-07T14:17:54Z

I want to avoid from loading the wav file again (for efficiency) and to resample the array ...
How can I resample the audio in the exact way ffmpeg do it ?

FYI, in case this is what you are looking for:

Pipe input in to ffmpeg stdin

https://stackoverflow.com/questions/45899585/pipe-input-in-to-ffmpeg-stdin
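
The piping idea from the linked answer could look roughly like the sketch below: the already-loaded samples are written to ffmpeg's stdin as raw 16-bit PCM, and the resampled audio is read back from stdout. This is a minimal sketch under several assumptions, not whisper's own code: the function names (float_to_pcm16, build_ffmpeg_cmd, resample_with_ffmpeg) are hypothetical, the input is assumed to be mono float samples in [-1, 1] (as torchaudio.load returns by default), the ffmpeg binary is assumed to be on PATH, and the output arguments are chosen to resemble what whisper.load_audio passes to ffmpeg. Whether the result matches whisper.load_audio sample for sample should be checked on your own files.

```python
import subprocess

import numpy as np


def float_to_pcm16(samples):
    """Convert float samples in [-1, 1] to little-endian 16-bit PCM."""
    scaled = np.round(np.asarray(samples, dtype=np.float64) * 32768.0)
    # +1.0 would map to 32768, which does not fit in int16, so clip.
    return np.clip(scaled, -32768, 32767).astype("<i2")


def build_ffmpeg_cmd(orig_sr, target_sr=16000):
    """Build an ffmpeg command that reads raw PCM on stdin and writes it to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        # Input: raw mono s16le at the original rate, read from stdin.
        "-f", "s16le", "-ar", str(orig_sr), "-ac", "1", "-i", "pipe:0",
        # Output: raw mono s16le at the target rate, written to stdout.
        "-f", "s16le", "-ac", "1", "-acodec", "pcm_s16le", "-ar", str(target_sr),
        "pipe:1",
    ]


def resample_with_ffmpeg(samples, orig_sr, target_sr=16000):
    """Resample a mono float array by piping it through ffmpeg (requires ffmpeg)."""
    pcm = float_to_pcm16(samples)
    result = subprocess.run(
        build_ffmpeg_cmd(orig_sr, target_sr),
        input=pcm.tobytes(),
        capture_output=True,
        check=True,  # raise CalledProcessError if ffmpeg exits with an error
    )
    # Scale back to float32 in [-1, 1).
    return np.frombuffer(result.stdout, dtype="<i2").astype(np.float32) / 32768.0
```

With a tensor from torchaudio.load, which has shape (channels, frames), a call might be resample_with_ffmpeg(waveform[0].numpy(), 8000). The returned float32 array can then be passed to model.transcribe, which accepts a NumPy array as well as a file path.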
