Realtime STT with Bidirectional Streaming (gRPC) Turkish #2183
Unanswered
EngincanVaran
asked this question in Q&A
Replies: 1 comment
- Data center GPUs like the V100, A100, and H100 are suited to training rather than inference. There are also many other factors at play, such as connection speed, disk I/O, etc. Also try out other forks like |
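Before adding hardware, it can help to measure where the time actually goes. Below is a minimal sketch for checking the real-time factor (processing time divided by audio length) with faster-whisper, which the question also mentions; the model size, file name, and compute type are placeholders:

```python
import time
from faster_whisper import WhisperModel

# "medium" and the file name are placeholders; compute_type="float16" needs a GPU
model = WhisperModel("medium", device="cuda", compute_type="float16")

start = time.perf_counter()
segments, info = model.transcribe("sample_tr.wav", language="tr", beam_size=1)
text = " ".join(s.text for s in segments)   # transcribe() is lazy; decoding happens while consuming the generator
elapsed = time.perf_counter() - start

print(f"audio: {info.duration:.1f} s  processing: {elapsed:.1f} s  RTF: {elapsed / info.duration:.2f}")
```

An RTF well below 1.0 means the model itself is not the bottleneck and the latency is more likely coming from buffering, network, or disk I/O.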
-
Hi all!
I want to design a bidirectional streaming gRPC service around the OpenAI Whisper model (unfortunately, there are not many models for Turkish). I am struggling with the streaming part. I have also tried faster-whisper, but it still does not seem fast enough. I am using an NVIDIA V100 GPU.
All in all, how can I make my streaming faster and more efficient? Has anybody heard of the NVIDIA Triton Inference Server being used for GPU virtualization? Is there a video about it? Maybe I am missing something. Can I use it to increase my transcription speed? I can use up to 4 V100 GPUs if I need to.
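For reference, here is a minimal sketch of what such a bidirectional streaming servicer could look like. It assumes a hypothetical `speech.proto` with `rpc StreamingRecognize(stream AudioChunk) returns (stream Transcript)`, where `AudioChunk` carries raw 16 kHz mono PCM16 bytes, compiled to stubs `speech_pb2` / `speech_pb2_grpc`; the window length, model size, and compute type are placeholders, not a recommendation:

```python
import numpy as np
import grpc
from concurrent import futures
from faster_whisper import WhisperModel

import speech_pb2          # hypothetical generated module (see assumptions above)
import speech_pb2_grpc     # hypothetical generated module (see assumptions above)

SAMPLE_RATE = 16000        # Whisper operates on 16 kHz mono audio
WINDOW_SECONDS = 5         # transcribe roughly every 5 s of buffered audio

class Transcriber(speech_pb2_grpc.TranscriberServicer):
    def __init__(self):
        # float16 on a V100 is typically much faster than the default float32
        self.model = WhisperModel("large-v2", device="cuda", compute_type="float16")

    def StreamingRecognize(self, request_iterator, context):
        buffer = np.zeros(0, dtype=np.float32)
        for chunk in request_iterator:
            # int16 PCM -> float32 in [-1, 1]; faster-whisper accepts numpy audio directly
            pcm = np.frombuffer(chunk.pcm16, dtype=np.int16).astype(np.float32) / 32768.0
            buffer = np.concatenate([buffer, pcm])
            if len(buffer) >= WINDOW_SECONDS * SAMPLE_RATE:
                segments, _ = self.model.transcribe(buffer, language="tr", beam_size=1)
                text = " ".join(s.text.strip() for s in segments)
                yield speech_pb2.Transcript(text=text)
                # naive: start a fresh window; real code should keep some overlap for context
                buffer = np.zeros(0, dtype=np.float32)

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    speech_pb2_grpc.add_TranscriberServicer_to_server(Transcriber(), server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()

if __name__ == "__main__":
    serve()
```

The fixed-window approach above drops context between windows; overlapping windows or a VAD-based segmenter usually works better for streaming, and running one model instance per GPU behind the same server is the simplest way to spread load across several V100s.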