It depends, but a bigger dataset is (almost always) better. There is no magic number, and no way to answer this perfectly in advance, but at a minimum you would need a few hundred examples, and thousands if you can get them. This flowchart may be helpful.
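
Since there is no way to know the right number in advance, one common way to check empirically is a learning curve: train on growing slices of the data you already have and see whether the dev score is still improving. Below is a minimal sketch of that idea, not anything built into spaCy. `learning_curve` and `toy_train_and_score` are hypothetical names, and the toy "model" just memorizes labels so the example runs on its own; in practice you would swap in a function that trains and evaluates your real pipeline.

```python
import random
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, int]  # (text, label)


def learning_curve(
    examples: Sequence[Example],
    dev: Sequence[Example],
    train_and_score: Callable[[Sequence[Example], Sequence[Example]], float],
    fractions: Sequence[float] = (0.25, 0.5, 0.75, 1.0),
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Score a model trained on growing prefixes of one fixed shuffle."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    curve = []
    for frac in fractions:
        n = max(1, int(len(shuffled) * frac))
        # Each subset contains the previous one, so sizes are comparable.
        curve.append((n, train_and_score(shuffled[:n], dev)))
    return curve


def toy_train_and_score(train: Sequence[Example], dev: Sequence[Example]) -> float:
    """Hypothetical stand-in for real training: memorize text -> label."""
    memory = {text: label for text, label in train}
    # Unseen texts fall back to label 0.
    correct = sum(1 for text, label in dev if memory.get(text, 0) == label)
    return correct / len(dev)


# Synthetic data (an assumption for illustration): 50 distinct "texts",
# each with a fixed label.
rng = random.Random(42)


def sample(n: int) -> List[Example]:
    return [(f"word{i}", i % 2) for i in (rng.randrange(50) for _ in range(n))]


train_examples = sample(200)
dev_examples = sample(50)

curve = learning_curve(train_examples, dev_examples, toy_train_and_score)
for n, score in curve:
    print(f"{n:>4} examples -> dev accuracy {score:.2f}")
```

If the score is still climbing at the largest size, more annotation will probably help; if it has flattened out, extra examples of the same kind are less likely to be what is holding the model back.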

For more perspective, what kind of model are you trying to train, and what are you trying to label with it specifically?

Answer selected by polm
Labels: usage (General spaCy usage)