How can I use pydub output as Whisper audio input? #983
Unanswered
MarriamSiddiqui
asked this question in
Q&A
Replies: 2 comments 1 reply
-
This thread may be helpful.
0 replies
-
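import numpy as np
import whisper
from pydub import AudioSegment

model = whisper.load_model("base")
audio = AudioSegment.from_file("audio.wav")
# Whisper expects 16 kHz mono audio; force 16-bit samples so the int16 view is valid
audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
# Scale int16 samples to float32 in [-1, 1)
samples = np.frombuffer(audio.raw_data, np.int16).astype(np.float32) / 32768.0
result = model.transcribe(samples)
print(result["text"])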
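When `transcribe` is given a NumPy array instead of a file path, Whisper does not load or resample anything itself and treats the array as 16 kHz mono float32 samples in [-1, 1]. A minimal sketch of a helper that normalizes any pydub `AudioSegment` to that layout before converting it; `segment_to_whisper_array` is a hypothetical name, not part of pydub or Whisper:

```python
import numpy as np

WHISPER_SAMPLE_RATE = 16000  # sample rate Whisper assumes for array input


def segment_to_whisper_array(segment):
    """Return a pydub AudioSegment as mono 16 kHz float32 samples in [-1, 1).

    Hypothetical helper; `segment` is expected to be a pydub.AudioSegment.
    """
    # Resample, downmix to one channel, and force 16-bit samples so the
    # int16 view below matches the actual byte layout.
    segment = (
        segment.set_frame_rate(WHISPER_SAMPLE_RATE)
        .set_channels(1)
        .set_sample_width(2)
    )
    samples = np.frombuffer(segment.raw_data, dtype=np.int16)
    # Scale the int16 range [-32768, 32767] to floats in [-1, 1).
    return samples.astype(np.float32) / 32768.0
```

It would be called as `model.transcribe(segment_to_whisper_array(audio))`. pydub's `AudioSegment.from_file` also accepts a file-like object such as `io.BytesIO` (pass `format="wav"` for WAV data), so the segment does not have to come from a file on disk.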
1 reply
-
I obtain the audio from the function below and then need to transcribe it with the Whisper model. How can I pass the output of this function to Whisper?
Note: I cannot save the audio to a file (even temporarily) because of latency constraints.
def extract_audio_buffer(data_arg):
    sequence_id['audio_id'] += 1
    # Create a buffer to hold the audio data
    audio_buffer = io.BytesIO()
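The function above is cut off, so what ends up in `audio_buffer` is unknown. Assuming it holds a complete 16-bit PCM WAV file (for example, one written by pydub's `export(audio_buffer, format="wav")`), one way to hand it to Whisper without touching disk is to decode it in memory. A rough sketch using the standard-library `wave` module and NumPy; `wav_buffer_to_whisper_array` is a hypothetical name, and the 16 kHz check reflects the sample rate Whisper assumes for array input:

```python
import io
import wave

import numpy as np


def wav_buffer_to_whisper_array(audio_buffer: io.BytesIO) -> np.ndarray:
    """Decode an in-memory 16-bit PCM WAV into mono float32 samples in [-1, 1).

    Hypothetical helper. Assumes the buffer holds a complete WAV file that is
    already at 16 kHz; resampling is deliberately left out of this sketch.
    """
    audio_buffer.seek(0)  # a buffer that was just written sits at its end
    with wave.open(audio_buffer, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        if wav.getframerate() != 16000:
            raise ValueError("expected 16 kHz audio; resample before exporting")
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())

    # WAV PCM is little-endian; interleaved channels become columns.
    samples = np.frombuffer(frames, dtype="<i2").reshape(-1, channels)
    # Average the channels to get mono, then scale to [-1, 1).
    return samples.astype(np.float32).mean(axis=1) / 32768.0
```

The result could then be passed straight to the model, e.g. `model.transcribe(wav_buffer_to_whisper_array(audio_buffer))`. If the buffer holds a compressed format or a different sample rate, this sketch does not apply as written and the audio would need to be decoded and resampled first.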