Replies: 1 comment
-
FYI, in case this is what you are looking for: "Pipe input in to ffmpeg stdin".
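A rough sketch of what piping an already-loaded array through ffmpeg could look like in Python, using only the standard library. The helper names (`build_ffmpeg_resample_cmd`, `floats_to_s16le`, `s16le_to_floats`, `resample_with_ffmpeg`) are made up for illustration. It assumes ffmpeg is on PATH, the audio is mono, and the original file was 16-bit PCM, so converting the float samples back to int16 loses nothing. The output flags mirror the ones whisper.load_audio passes to ffmpeg, but I have not verified that the result is bit-identical to loading the file directly.

```python
import shutil
import struct
import subprocess


def build_ffmpeg_resample_cmd(orig_sr, target_sr=16000):
    """Build an ffmpeg command that reads raw mono s16le PCM from stdin
    and writes raw mono s16le PCM at target_sr to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        # Input: headerless 16-bit little-endian PCM, mono, original rate.
        "-f", "s16le", "-ar", str(orig_sr), "-ac", "1", "-i", "pipe:0",
        # Output: same raw format that whisper.load_audio requests.
        "-f", "s16le", "-ac", "1", "-acodec", "pcm_s16le",
        "-ar", str(target_sr), "pipe:1",
    ]


def floats_to_s16le(samples):
    """Convert floats in [-1, 1] to little-endian int16 bytes (with clipping)."""
    ints = [max(-32768, min(32767, round(s * 32768.0))) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def s16le_to_floats(raw):
    """Convert little-endian int16 bytes back to floats in [-1, 1)."""
    n = len(raw) // 2
    return [v / 32768.0 for v in struct.unpack(f"<{n}h", raw[: 2 * n])]


def resample_with_ffmpeg(samples, orig_sr, target_sr=16000):
    """Pipe in-memory samples through ffmpeg and return the resampled floats."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    result = subprocess.run(
        build_ffmpeg_resample_cmd(orig_sr, target_sr),
        input=floats_to_s16le(samples),
        capture_output=True,
        check=True,  # raise CalledProcessError if ffmpeg exits with an error
    )
    return s16le_to_floats(result.stdout)
```

With a mono tensor from torchaudio.load, something like `resample_with_ffmpeg(waveform[0].tolist(), 8000)` should work. For long recordings the pure-Python conversion is slow, so swapping the two converters for NumPy equivalents would be a reasonable change.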
-
In my app, I'm getting an array of audio samples (sample rate = 8000) that was loaded with torchaudio.load. For efficiency, I want to avoid loading the wav file again and instead resample the existing array to 16000.
whisper.load_audio uses ffmpeg to load and resample the audio to 16000. I'm trying to use librosa or torchaudio to resample the audio array, but the results never match what ffmpeg produces. (I assume that if I use a resampling method other than the one the Whisper model was trained with, I may get bad results.)
Example:
Loading the test.wav file (SR = 8000) with Whisper and printing the first 5 samples:
whisper_audio = whisper.load_audio(file)
=> [-0.00082397 -0.00115967 -0.00186157 -0.00231934 -0.00222778, ...]
Loading with torchaudio and resampling with librosa:
librosa.resample(vad_audio, orig_sr=8000, target_sr=16000, scale=True, res_type='kaiser_best')
=> [-0.00082317 -0.0010577 -0.0013937 -0.0016688 -0.00186235, ...]
The values are different.
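To put a number on the mismatch, here is a small sketch that compares the two printed heads. The values are copied from the outputs above; `max_abs_diff` is just an illustrative helper. The gap is also expressed in 16-bit steps (1/32768), on the assumption that whisper.load_audio decodes ffmpeg's 16-bit output by dividing by 32768.

```python
# First five samples as printed in the two outputs above.
ffmpeg_head = [-0.00082397, -0.00115967, -0.00186157, -0.00231934, -0.00222778]
librosa_head = [-0.00082317, -0.0010577, -0.0013937, -0.0016688, -0.00186235]


def max_abs_diff(a, b):
    """Return (largest absolute difference, index where it occurs)."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    worst = max(range(len(diffs)), key=diffs.__getitem__)
    return diffs[worst], worst


diff, index = max_abs_diff(ffmpeg_head, librosa_head)
# Show the gap both as a float and as a count of int16 quantization steps.
print(f"max |diff| = {diff:.8f} at index {index} ({diff * 32768:.1f} int16 steps)")
```

A gap of many int16 steps would suggest the two resamplers really do filter differently, rather than the mismatch being only rounding to 16 bits.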
How can I resample the audio in exactly the same way ffmpeg does?