Can you check whether any other processes are occupying the VRAM? You can run nvidia-smi before starting the server to see the current memory usage.
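If you would rather do that check from a script than by eye, something like the following might work. It is only a sketch: the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for illustration, and it assumes `nvidia-smi` is on the PATH and reports plain integer MiB values for every GPU.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB rows (one per GPU) into a list of (used, total) ints."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Return [(used_mib, total_mib), ...] per GPU, or [] if nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    # Run this before starting the server; a large "used" value means
    # something else is already holding VRAM.
    for index, (used, total) in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {used} MiB used of {total} MiB")
```

To see which processes hold the memory, the plain `nvidia-smi` output lists them with their PIDs at the bottom.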

Also, please make sure the model is not being loaded more than once. That can happen if load_model() is called repeatedly, for example from each of the separate worker processes that a gunicorn server may launch.
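One way to rule out repeated loads inside a single process is to route every call through a small cache. This is a rough sketch, not the library's own mechanism: `get_model` and the `loader` argument are hypothetical names, and `loader` stands in for whatever function wraps your `load_model()` call.

```python
import os
import threading

_model = None
_model_lock = threading.Lock()


def get_model(loader):
    """Return the process-wide model, calling loader() only on the first request."""
    global _model
    if _model is None:
        with _model_lock:
            if _model is None:  # re-check after acquiring the lock
                # Logging the PID shows how many processes load their own copy.
                print(f"loading model in process {os.getpid()}")
                _model = loader()
    return _model
```

Note that this only deduplicates loads within one process. Each gunicorn worker is a separate process and would still hold its own copy in VRAM, so if the log line shows up once per worker, try starting the server with `--workers 1` and see whether the memory usage drops.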

Answer selected by jongwook