Replies: 2 comments 1 reply
-
Unfortunately, the CLI is essentially one-time use: it loads the model, transcribes, and exits, so the initialisation cost is paid on every invocation. You will have to find a way to use Python as your backend, keeping one long-lived process in which the loaded model persists between requests.
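A minimal sketch of that approach, assuming the openai-whisper and Flask packages; the route name, port, and "base" model size are illustrative choices, and the JavaScript container is assumed to share a filesystem (e.g. a mounted volume) with this process:

```python
# Minimal sketch: a long-lived Python process that loads Whisper once
# and serves transcriptions over HTTP. Assumes `pip install openai-whisper flask`.
import whisper
from flask import Flask, jsonify, request

app = Flask(__name__)

# Paid once at startup; every request after this reuses the cached model.
model = whisper.load_model("base")

@app.route("/transcribe", methods=["POST"])
def transcribe():
    # Expects JSON like {"path": "/audio/sample.mp3"} pointing to a file
    # this process can read (e.g. a volume shared with the JS container).
    path = request.get_json()["path"]
    result = model.transcribe(path)
    return jsonify({"text": result["text"]})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

The JavaScript side then swaps its CLI invocation for an HTTP request (e.g. a `fetch` against `http://localhost:5000/transcribe`), so the model load cost is paid only once, when the server starts.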
-
Since I am not familiar with Python, I would appreciate any advice you might have on memory management (RAM and CUDA-based GPU) for a first-timer.
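For a first-timer, the usual PyTorch memory hygiene applies, since Whisper is a PyTorch model. A minimal sketch, assuming the openai-whisper and torch packages; the "small" model size is an illustrative choice:

```python
# Common PyTorch memory hygiene as it applies to Whisper; not an
# authoritative answer, just the standard torch/whisper calls.
import gc
import torch
import whisper

# Smaller checkpoints ("tiny", "base", "small") need far less RAM/VRAM
# than "medium" or "large"; pick the smallest that is accurate enough.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("small", device=device)

if torch.cuda.is_available():
    # How much VRAM the loaded model is currently occupying.
    print(f"VRAM in use: {torch.cuda.memory_allocated() / 1e6:.0f} MB")

# When the model is no longer needed, release it explicitly:
del model
gc.collect()
torch.cuda.empty_cache()  # returns cached blocks to the GPU driver
```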
-
Hi,
I am new to the world of AI and models, so apologies in advance if this question sounds stupid.
I have noticed that on a GPU-enabled system, Whisper performs reasonably well in the actual transcription step. However, most of the time is spent in the model initialisation step, where the model is loaded into memory (GPU and RAM). I have heard about batch processing, but it is not suitable for my architecture right now, as it would require fundamental changes, which is not ideal in the bigger picture.
I was wondering if there is a way to keep the model cached in memory so that I can avoid the initialisation cost on every run. For context, I am calling the model from a JavaScript container via its CLI.
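To make the cost split concrete, here is a minimal sketch (assuming the openai-whisper Python package; the file names and "base" model size are placeholders) showing that the checkpoint load is paid once per process, which is exactly what a long-lived process amortises across files:

```python
# Sketch: loading the checkpoint dominates, but it is a one-time,
# per-process cost; subsequent transcriptions reuse the resident model.
import time
import whisper

t0 = time.perf_counter()
model = whisper.load_model("base")  # the expensive one-time step
print(f"load: {time.perf_counter() - t0:.1f}s")

for path in ["clip1.mp3", "clip2.mp3"]:  # placeholder audio files
    t0 = time.perf_counter()
    text = model.transcribe(path)["text"]
    print(f"{path}: {time.perf_counter() - t0:.1f}s")
```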