CUDA out of memory on Large-v2 using Tesla T4 #1319
-
Hey everyone, I am running a Gunicorn server with a Tesla T4 GPU (16 GB of VRAM), and I am getting a CUDA out-of-memory error while the GPU is being initialized on startup.
I tried clearing the torch cache with `torch.cuda.empty_cache()`.
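Roughly, my startup does the following (a simplified sketch of my setup; the surrounding Gunicorn app code is omitted):

```python
import torch
import whisper

# Release memory held by PyTorch's caching allocator before loading.
# Note: this only frees cached, unreferenced blocks; it cannot reclaim
# VRAM used by other processes or by tensors that are still alive.
torch.cuda.empty_cache()

# Load the large-v2 model onto the T4.
model = whisper.load_model("large-v2", device="cuda")
```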
-
Can you check if there are any other processes occupying the VRAM? You can run `nvidia-smi` to check the memory usage before starting the server. Also, please make sure the model is not getting loaded multiple times, which can happen if `load_model()` is called multiple times, potentially from different processes (workers) that the Gunicorn server may launch. A sketch of the single-load pattern is below.
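If multiple loads turn out to be the problem, one common pattern is to load the model lazily, at most once per worker process, rather than at import time. A minimal sketch (assuming a standard `openai-whisper` install; `get_model()` is a hypothetical helper, not part of Whisper's API):

```python
import whisper

_model = None

def get_model():
    """Load the Whisper model at most once per worker process."""
    global _model
    if _model is None:
        _model = whisper.load_model("large-v2", device="cuda")
    return _model
```

Alternatively, running Gunicorn with a single worker (`gunicorn -w 1 ...`) avoids several workers each holding their own copy of the large-v2 weights on the same GPU.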