Does spaCy require shuffling data before training? #10208
kanayer asked this question in Help: Coding & Implementations
Answered by ljvmiranda921 on Feb 7, 2022
Answer selected by kanayer
Hi @kannaricci,
Ideally you want to shuffle your data so that each training batch is more representative of the whole dataset and training doesn't depend on the order / index of the examples. If you set max_epochs>=0 during training, the training Corpus is shuffled automatically every epoch, so you don't need to worry about shuffling it yourself :)
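
To make the "shuffled every epoch" idea concrete, here is a minimal sketch in plain Python. It is not spaCy's actual training loop: `epoch_batches` and its parameters are hypothetical names made up for illustration, and `max_epochs` here is simply a positive number of passes (in spaCy's config, 0 means an unlimited number of epochs and -1 means the corpus is streamed without shuffling).

```python
import random
from typing import Iterator, List, Sequence, Tuple


def epoch_batches(
    examples: Sequence[str], batch_size: int, max_epochs: int, seed: int = 0
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (epoch, batch) pairs, reshuffling the data before every epoch.

    Hypothetical helper for illustration only; not part of spaCy.
    """
    rng = random.Random(seed)
    data = list(examples)  # copy so the caller's list is left untouched
    for epoch in range(max_epochs):
        rng.shuffle(data)  # a fresh order at the start of each epoch
        for start in range(0, len(data), batch_size):
            yield epoch, data[start:start + batch_size]


if __name__ == "__main__":
    docs = [f"doc{i}" for i in range(6)]
    for epoch, batch in epoch_batches(docs, batch_size=2, max_epochs=2):
        print(epoch, batch)
```

Every epoch still sees every example exactly once; only the order, and therefore the composition of each batch, changes between epochs.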