Spancat - help understanding training memory usage #9214
-
I'm trying to understand a strange (or maybe not so strange) memory issue I'm having while training the spancat portion of a model. I have a training set of 2,000 examples with 4 labels that I annotated in Prodigy. When I train on a CPU, I see a major spike in memory usage, and then it drops to almost nothing. When I export the data from Prodigy and train with spaCy directly, memory usage is higher if I don't use a base model. Why does the memory usage spike and then drop during CPU training, and why do I get higher memory usage when I don't use a base model?
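For context, the two training routes being compared would look roughly like this. This is a sketch, not the exact commands from the thread: the dataset name, output paths, and the choice of `en_core_web_sm` as base model are all hypothetical placeholders (flags as in Prodigy 1.11+):

```bash
# Route 1: train directly in Prodigy, with and without a base model
# (hypothetical dataset name and output path)
prodigy train ./output --spancat my_spans_dataset --base-model en_core_web_sm
prodigy train ./output --spancat my_spans_dataset

# Route 2: export the annotations, then train with spaCy directly
prodigy data-to-spacy ./corpus --spancat my_spans_dataset
python -m spacy train ./corpus/config.cfg \
    --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy
```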
Replies: 1 comment
-
The base model influences the type of network architecture that is used for training, so it can certainly influence memory usage. A transformer-based model, for instance, will have a very different memory footprint than a small model. So if you're using `en_core_web_sm` as the base model, the config will source components from it and the architecture will be less complex.

To verify: can you share the exact commands you used to train with Prodigy, and to export from Prodigy and then train?
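To illustrate what "sourcing" means here (a minimal sketch of the two config styles, not the exact configs from this thread): with a base model, the generated `config.cfg` reuses existing components, whereas a from-scratch config builds a new `tok2vec` network, which can be larger and hungrier:

```ini
# Variant A - generated with --base-model en_core_web_sm:
# the tok2vec component (and its weights) are sourced from the base model.
[components.tok2vec]
source = "en_core_web_sm"

# Variant B - generated without a base model:
# a fresh tok2vec is built and trained from scratch.
[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"
```

The exact architectures and sizes depend on the settings used to generate the config, which is why seeing the actual commands would help pin down the memory difference.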