This is more a question about web service backend design than about spaCy, so we can't be of much help here. The issue is likely that gunicorn starts 12 workers, each of which may load its own copy of the model onto the GPU. Furthermore, depending on when gunicorn forks its workers, there may be bad interactions with threading. So you probably want to build something into your application that enforces an acceptable upper bound on the number of spaCy models resident in GPU memory.
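One way to enforce such a bound is a semaphore created in the gunicorn master process (e.g. with `preload_app = True`, so forked workers share it): each worker tries to claim a GPU slot at startup and falls back to CPU if none is free. A minimal sketch, assuming a hypothetical cap of 4 GPU-resident models; the function and variable names are illustrative, not spaCy or gunicorn API:

```python
import multiprocessing

# Assumed cap on concurrently GPU-resident models (illustrative value).
MAX_GPU_MODELS = 4

# Created before gunicorn forks (e.g. at module import with preload_app=True),
# so every worker process shares the same pool of GPU slots.
gpu_slots = multiprocessing.Semaphore(MAX_GPU_MODELS)

def choose_device():
    """Claim a GPU slot if one is free, otherwise fall back to CPU.

    A worker would call this once at startup, before loading the model:

        if choose_device() == "gpu":
            spacy.require_gpu()
        nlp = spacy.load("en_core_web_trf")
    """
    got_slot = gpu_slots.acquire(block=False)  # non-blocking claim
    return "gpu" if got_slot else "cpu"
```

The semaphore is only a sketch of the idea; in a real deployment you would also release slots when workers exit, or simply cap gunicorn's worker count and pin a fixed subset of workers to the GPU.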

Replies: 1 comment

Answer selected by Anand195
Labels: gpu (Using spaCy on GPU), perf / memory (Performance: memory use)
2 participants