Speaches seems to keep 2 GB+ of VRAM allocated when I use the CUDA version unless it is told to clear the model from VRAM. I went from docker tag 0.8.3-cuda-12.6.3 to 0.9.0-rc.3-cuda-12.6.3, and it seems the new version ignores WHISPER__TTL=0, which should unload all models from VRAM after processing is complete. This worked wonders, especially since the models are small enough that reloading them onto my GPU is near-instant.
Is there a new environment variable that does something similar? Otherwise I will go back to 0.8.3-cuda-12.6.3.
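For reference, this is roughly how I'm passing the variable (a minimal sketch; the container name and port mapping are just examples, the image tag and env var are the ones mentioned above):

```shell
# Run the CUDA image with WHISPER__TTL=0, which on 0.8.3 unloaded
# models from VRAM immediately after each request finished.
docker run --gpus all \
  -e WHISPER__TTL=0 \
  -p 8000:8000 \
  --name speaches \
  ghcr.io/speaches-ai/speaches:0.9.0-rc.3-cuda-12.6.3
```

With the same command on 0.8.3-cuda-12.6.3, VRAM drops back down after each request; on 0.9.0-rc.3 it stays allocated.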