Passing Whisper Bytes Instead of Filenames #1507
Unanswered
Codie-Petersen asked this question in Q&A
Replies: 1 comment
-
You should check the Whisper source file.
-
I'm developing a speech recognition subsystem that uses Whisper as the transcriber, running in my local environment. I've written a Voice Activity Detection algorithm that picks up only voice and extracts clean voice data fairly easily. The only problem is that I save the audio to disk and then pass the file path to whisper.transcribe(), which loads it back from disk before transcribing.
I have an NVMe drive and I'm still getting 0.5 second transcription times, but I'd like something more performant than using disk space as memory. I know there is a way to feed the audio directly to the transcribe function, but it doesn't seem to like the format I have given it. I am getting this error: RuntimeError: "reflection_pad1d" not implemented for 'Short'.
I was wondering what the shape of the data needs to be to pass it along.
For context, here is the code that is saving the data:
The idea originally was to use audio_data_int16 as the input into transcribe. But that didn't work, so I just stuck with saving to disk. Thanks in advance.
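For anyone hitting the same error: 'Short' is PyTorch's name for the int16 dtype, so the message suggests the 16-bit samples reached Whisper's spectrogram step unconverted. Whisper's own load_audio decodes a file to 16-bit PCM and then returns float32 samples scaled by 1/32768, as a 1-D mono array at 16 kHz. Below is a minimal sketch of doing that conversion in memory. It assumes audio_data_int16 is already mono 16 kHz PCM (the original saving code is not shown, so this is a guess), and the helper names are made up for illustration, not part of Whisper.

```python
import numpy as np

# Whisper expects mono audio sampled at 16 kHz; this sketch does not resample.
WHISPER_SAMPLE_RATE = 16000


def int16_to_float32(audio_data_int16):
    """Convert 16-bit PCM samples to a 1-D float32 array in [-1.0, 1.0).

    Hypothetical helper: mirrors the scaling Whisper's load_audio applies
    after decoding a file (divide by 32768.0).
    """
    samples = np.asarray(audio_data_int16, dtype=np.int16).flatten()
    return samples.astype(np.float32) / 32768.0


def pcm_bytes_to_float32(raw):
    """Same conversion, starting from raw little-endian 16-bit PCM bytes."""
    return int16_to_float32(np.frombuffer(raw, dtype="<i2"))


def transcribe_in_memory(model, audio_data_int16):
    """Pass the converted array to an already loaded Whisper model.

    Assumes `model` came from whisper.load_model(...) and that the audio
    is mono at WHISPER_SAMPLE_RATE; nothing is written to disk.
    """
    return model.transcribe(int16_to_float32(audio_data_int16))


# Usage (requires the openai-whisper package, so left commented out):
# import whisper
# model = whisper.load_model("base")
# result = transcribe_in_memory(model, audio_data_int16)
# print(result["text"])
```

If the VAD output is at a different sample rate or has more than one channel, it would need resampling or downmixing first; that part is not covered here.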