Efficiently use NVIDIA A2 (16 GB) GPU for inference #12414
Hello, I have trained a spaCy model on a custom dataset and it works fine for inference (prediction) on CPU. Now I'm looking to upgrade to GPU, but I'm running into GPU memory overload. I have created a Django project that uses this spaCy model and deployed it with gunicorn and nginx; here is my gunicorn config. For every new request to the server, the model loads onto the GPU and memory consumption increases until the GPU is overloaded. How can I overcome this issue?
Replies: 1 comment
This is more a question about web service backend design than a question about spaCy, so we can't be of much help here. The issue is probably that gunicorn starts 12 workers, each of which might load the model on the GPU. Furthermore, depending on when gunicorn forks workers, there may be bad interactions with threading. So you probably want to build something into your application that puts an acceptable upper bound on the number of spaCy models in GPU memory.
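A minimal sketch of one way to do this, assuming the model is loaded inside the Django app (the module name `nlp_singleton.py`, the `get_nlp()` helper, and the model path are all hypothetical): load the pipeline lazily, once per worker process, so each gunicorn worker holds at most one copy of the model in GPU memory.

```python
# nlp_singleton.py -- hypothetical module name; a sketch, not spaCy's API.
import threading

import spacy

_nlp = None
_lock = threading.Lock()


def get_nlp():
    """Return one shared spaCy pipeline per worker process.

    Loading lazily behind a lock means each gunicorn worker puts at most
    one copy of the model on the GPU, instead of one copy per request.
    """
    global _nlp
    if _nlp is None:
        with _lock:
            if _nlp is None:
                spacy.require_gpu()  # or spacy.prefer_gpu() to fall back to CPU
                _nlp = spacy.load("path/to/your/model")  # placeholder path
    return _nlp
```

In the Django view, call `get_nlp()` instead of `spacy.load(...)` so the load happens once per worker. Combined with a lower worker count in the gunicorn config (the values below are assumptions to tune against your model size and the 16 GB card), this puts an upper bound on total GPU memory use:

```python
# gunicorn.conf.py -- illustrative values only; the original config is not shown.
workers = 2          # assumption: 2 model copies fit comfortably in 16 GB
worker_class = "sync"
preload_app = False  # load the model after the fork, in each worker, to avoid
                     # initializing the GPU in the master process before forking
```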