Understanding training output for textcat_multilabel - steps vs epochs #10343
Question originally asked on the Prodigy support forum. I'm trying to understand the training output for `textcat_multilabel`. As I understand it, an epoch means one full pass over the training data; in my case that's 24,404 documents. The training part of my config looks like this:
Now I started wondering about this because I saw that at step 16800 I reached the next epoch, which leaves me with an average batch size of 24404 / 16800 ≈ 1.45 documents. Is that right? In general my documents are pretty big, but performance is good, so I don't need to chop them into smaller docs and average over them. But maybe I could benefit from fiddling with the batching strategy. Any comments on that?
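As a quick sanity check of the arithmetic in the question (the document and step counts are taken from the post above):

```python
# Average docs per optimizer step, using the numbers from the question:
# 24,404 training docs, and the next epoch starting at step 16,800.
n_docs = 24_404
steps_per_epoch = 16_800

avg_batch_size = n_docs / steps_per_epoch
print(f"{avg_batch_size:.2f} docs per step")
```

So an average of roughly 1.45 documents per step is indeed what those numbers imply.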
Replies: 1 comment
That is correct.
I'm not sure what you mean. If your unit is "whole passes over the training data", then the batch size is 1 by definition. Note that if you have a batch size of 1000 literal docs, you'll see output like this:
If you interpret this as meaning that the first epoch was actually 900 iterations, that would be wrong. It might help to think of the iterations column as "iterations finished".
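The "iterations finished" idea can be sketched with a toy model (the doc and batch counts here are made up for illustration, not taken from the thread): the epoch column reports how many complete passes over the data have finished so far, not which pass the current batch belongs to.

```python
# Toy model of the relationship between the epoch (E) and step (#)
# columns in the training output: the epoch number only ticks over
# once a full pass has *finished*.

def epoch_at_step(n_docs: int, batch_size: int, step: int) -> int:
    """Number of full passes over the data finished after `step` batches."""
    return (step * batch_size) // n_docs

# With 1000 docs and a fixed batch size of 100 docs, epoch 1 is
# reached after step 10, not at step 1.
for step in (1, 9, 10, 11, 20):
    print(f"step {step:2d} -> epochs finished: "
          f"{epoch_at_step(n_docs=1000, batch_size=100, step=step)}")
```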
The main reason to adjust batch size is the tradeoff between training speed and memory use. There's a lot of research into which batch sizes are better, but in most cases I would expect minimal effects on accuracy from changing the batch size. You can always try and see, though.
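If you do want to experiment with the batching strategy, it lives in the `[training.batcher]` block of the config. For reference, spaCy v3's quickstart default batches by word count with a compounding size schedule, roughly like this (the exact default values are an assumption on my part, so check them against your generated config):

```ini
[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
```

With very long documents, a word-based batcher like this will often fit only one or two docs per batch, which is consistent with the ~1.45 docs-per-step average in the question. Switching to `spacy.batch_by_sequence.v1` would batch by a fixed number of docs instead, at the cost of more variable memory use per batch.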