I have 4 A16s available, and I am trying to run two simultaneous instances of Whisper in parallel to speed up batch processing of a large number of audio files. I am able to run one instance of Whisper without issue, but when I try to launch a second instance, I get the following error:

I am also diarizing my audio (using `pyannote-audio`), which I likewise run in parallel. That causes no issues, and I am able to spin up multiple instances without trouble. It uses Torch as well.

So I am not sure what could be causing this issue. Ideally, I would like to run three transcription instances on three individual A16s. I have around 1000 audio files to transcribe. Each instance of my `23_transcribe.py` will load a file from my DB and transcribe it, continuously, until there are no audio files left. If I am able to run multiple `23_transcribe.py` instances, it will dramatically cut down processing time.

This is from `nvidia-smi`, where `GPU0` is running `whisper` and `GPU1` is running `pyannote-audio`. When I try sending another Whisper instance to `GPU2`, it crashes with the above error, but I am able to send another `pyannote-audio` instance to it without issue. What could be limiting the number of parallel Whisper instances? Will I need to do anything more to achieve what I am trying to do?
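For reference, a minimal sketch of the one-worker-per-GPU pattern being described. This is not the poster's actual `23_transcribe.py`: it assumes the `openai-whisper` package, and the `glob` over a local `audio/` directory stands in for the DB queue mentioned above.

```python
import glob
import os
import sys

def main() -> None:
    gpu = sys.argv[1] if len(sys.argv) > 1 else "0"   # e.g. "2"
    # Mask all GPUs except the chosen one *before* torch initializes
    # CUDA, so this process can only allocate memory on that card.
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu
    import whisper  # imported after masking so CUDA sees one GPU

    # The large model needs roughly 10 GB of VRAM, so one copy per
    # 16 GB A16 GPU is the practical limit.
    model = whisper.load_model("large", device="cuda")

    # Stand-in for the DB queue: transcribe every file in audio/.
    for path in sorted(glob.glob("audio/*.wav")):
        result = model.transcribe(path)
        print(path, result["text"][:80])

if __name__ == "__main__":
    main()
```

Launching three copies of a script like this with arguments `0`, `1`, and `2` would give each worker process its own A16 GPU.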
Replies: 1 comment · 3 replies

16 GB is probably not enough to run two Whisper large models in two separate processes. There are implementations that support this, such as https://github.com/m-bain/whisperX, which does batched inference in a single process and would require less memory.
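A sketch of that batched, single-process approach, following the usage shown in the whisperX README (the model name, file path, and batch size here are illustrative; check the repo for the current API):

```python
import whisperx

device = "cuda"
# float16 halves the memory footprint relative to float32 weights.
model = whisperx.load_model("large-v2", device, compute_type="float16")

audio = whisperx.load_audio("audio/example.wav")
# Batched inference in a single process: audio segments are
# transcribed 16 at a time instead of running a second full
# model in another process.
result = model.transcribe(audio, batch_size=16)
for segment in result["segments"]:
    print(segment["start"], segment["text"])
```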