Spancat - help understanding training memory usage #9214
-
I'm trying to understand a strange (or maybe not so strange) memory issue I'm having while training the spancat portion of a model. I have a training set of 2,000 examples with 4 labels that I annotated in Prodigy. When I train on a CPU, I see a major spike in memory usage, and then it drops to almost nothing. When I export the data from Prodigy and train with spaCy directly, memory usage is higher if I don't use a base model. Why does the memory usage spike and then drop during CPU training, and why do I get higher memory usage when I don't use a base model?
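For context, the two training routes being compared would look roughly like this. This is a sketch, not the exact commands from the thread: the dataset name, output paths, and the choice of `en_core_web_sm` as base model are all hypothetical placeholders (flags as in Prodigy 1.11+):

```bash
# Route 1: train directly in Prodigy, with and without a base model
# (hypothetical dataset name and output path)
prodigy train ./output --spancat my_spans_dataset --base-model en_core_web_sm
prodigy train ./output --spancat my_spans_dataset

# Route 2: export the annotations, then train with spaCy directly
prodigy data-to-spacy ./corpus --spancat my_spans_dataset
python -m spacy train ./corpus/config.cfg \
    --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy
```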
Replies: 1 comment
-
The base model influences the type of network architecture that is used for training, so it can certainly influence memory usage. A transformer-based model, for instance, will have a very different memory footprint than a small model. So if you're using `en_core_web_sm` as the base model, the config will source components from it and the architecture will be less complex.

To verify: can you share the exact commands you used to train with Prodigy, and to export from Prodigy and then train?
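To illustrate what "sourcing" means here (a minimal sketch of the two config styles, not the exact configs from this thread): with a base model, the generated `config.cfg` reuses existing components, whereas a from-scratch config builds a new `tok2vec` network, which can be larger and hungrier:

```ini
# Variant A - generated with --base-model en_core_web_sm:
# the tok2vec component (and its weights) are sourced from the base model.
[components.tok2vec]
source = "en_core_web_sm"

# Variant B - generated without a base model:
# a fresh tok2vec is built and trained from scratch.
[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"
```

The exact architectures and sizes depend on the settings used to generate the config, which is why seeing the actual commands would help pin down the memory difference.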