Realtime STT with Bidirectional Streaming (gRPC) Turkish #2183
Unanswered
EngincanVaran
asked this question in Q&A
Replies: 1 comment
- Data center GPUs like the V100, A100, and H100 are suited to training rather than inference. There are also many other factors at play, such as connection speed, disk I/O, etc. Also try out other forks like |
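Before adding hardware, it can help to measure where the time actually goes. Below is a minimal sketch for checking the real-time factor (processing time divided by audio length) with faster-whisper, which the question also mentions; the model size, file name, and compute type are placeholders:

```python
import time
from faster_whisper import WhisperModel

# "medium" and the file name are placeholders; compute_type="float16" needs a GPU
model = WhisperModel("medium", device="cuda", compute_type="float16")

start = time.perf_counter()
segments, info = model.transcribe("sample_tr.wav", language="tr", beam_size=1)
text = " ".join(s.text for s in segments)   # transcribe() is lazy; decoding happens while consuming the generator
elapsed = time.perf_counter() - start

print(f"audio: {info.duration:.1f} s  processing: {elapsed:.1f} s  RTF: {elapsed / info.duration:.2f}")
```

An RTF well below 1.0 means the model itself is not the bottleneck and the latency is more likely coming from buffering, network, or disk I/O.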
-
Hi all!
I want to design a bidirectional streaming gRPC service around the OpenAI Whisper model (unfortunately, there are not many models for Turkish). I am struggling with the streaming part. I have also tried faster-whisper, but it still does not seem fast enough. I am using an NVIDIA V100 GPU.
All in all, how can I make my streaming faster and more efficient? Has anybody heard of the NVIDIA Triton Inference Server being used for GPU virtualization? Is there a video about it? Maybe I am missing something. Can I use it to increase my transcription speed? I can use up to 4 V100 GPUs if I need to.
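For reference, here is a minimal sketch of what such a bidirectional streaming servicer could look like. It assumes a hypothetical `speech.proto` with `rpc StreamingRecognize(stream AudioChunk) returns (stream Transcript)`, where `AudioChunk` carries raw 16 kHz mono PCM16 bytes, compiled to stubs `speech_pb2` / `speech_pb2_grpc`; the window length, model size, and compute type are placeholders, not a recommendation:

```python
import numpy as np
import grpc
from concurrent import futures
from faster_whisper import WhisperModel

import speech_pb2          # hypothetical generated module (see assumptions above)
import speech_pb2_grpc     # hypothetical generated module (see assumptions above)

SAMPLE_RATE = 16000        # Whisper operates on 16 kHz mono audio
WINDOW_SECONDS = 5         # transcribe roughly every 5 s of buffered audio

class Transcriber(speech_pb2_grpc.TranscriberServicer):
    def __init__(self):
        # float16 on a V100 is typically much faster than the default float32
        self.model = WhisperModel("large-v2", device="cuda", compute_type="float16")

    def StreamingRecognize(self, request_iterator, context):
        buffer = np.zeros(0, dtype=np.float32)
        for chunk in request_iterator:
            # int16 PCM -> float32 in [-1, 1]; faster-whisper accepts numpy audio directly
            pcm = np.frombuffer(chunk.pcm16, dtype=np.int16).astype(np.float32) / 32768.0
            buffer = np.concatenate([buffer, pcm])
            if len(buffer) >= WINDOW_SECONDS * SAMPLE_RATE:
                segments, _ = self.model.transcribe(buffer, language="tr", beam_size=1)
                text = " ".join(s.text.strip() for s in segments)
                yield speech_pb2.Transcript(text=text)
                # naive: start a fresh window; real code should keep some overlap for context
                buffer = np.zeros(0, dtype=np.float32)

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    speech_pb2_grpc.add_TranscriberServicer_to_server(Transcriber(), server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()

if __name__ == "__main__":
    serve()
```

The fixed-window approach above drops context between windows; overlapping windows or a VAD-based segmenter usually works better for streaming, and running one model instance per GPU behind the same server is the simplest way to spread load across several V100s.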