Split data to perform NER #11527
Hi, I want to perform NER on a dataset.
Replies: 1 comment 2 replies
Hi @mandar-avhad, in general, spaCy expects the standard train / dev / test split for training and evaluation. This means you have to split your data beforehand, preferably into the serialized DocBin format. You can check a sample conversion script here.
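As a rough illustration of splitting beforehand, here is a minimal sketch. It assumes your annotations are (text, [(start, end, label), ...]) pairs with character offsets and no overlapping spans; `split_examples`, `write_docbin`, the fractions, and the file names are all hypothetical and not part of spaCy. Only `spacy.blank`, `nlp.make_doc`, `Doc.char_span`, and `DocBin` are actual spaCy calls.

```python
import random


def split_examples(examples, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle and split a list into train / dev / test portions."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_dev = int(len(items) * dev_frac)
    n_test = int(len(items) * test_frac)
    dev = items[:n_dev]
    test = items[n_dev:n_dev + n_test]
    train = items[n_dev + n_test:]
    return train, dev, test


def write_docbin(examples, path, lang="en"):
    """Serialize (text, spans) pairs to a .spacy file (assumed data layout)."""
    # Imported lazily so split_examples can be used without spaCy installed.
    import spacy
    from spacy.tokens import DocBin

    nlp = spacy.blank(lang)
    doc_bin = DocBin()
    for text, spans in examples:
        doc = nlp.make_doc(text)
        ents = []
        for start, end, label in spans:
            span = doc.char_span(start, end, label=label)
            if span is not None:  # skip spans that do not align with tokens
                ents.append(span)
        doc.ents = ents
        doc_bin.add(doc)
    doc_bin.to_disk(path)
```

You would then call something like `write_docbin(train, "train.spacy")` for each portion and point the training config at the resulting files.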
If you want to do stratified splits, you can implement a custom `Corpus` that gives you the correct batch during training. You can also check out an example project that does cross-validation (note: it may not be the most efficient solution).
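If a stratified split done ahead of time is enough, it can be sketched in plain Python. This is not the custom `Corpus` approach, only the stratification step, and it makes a simplifying assumption: each example's stratum is the set of distinct entity labels it contains. `label_key` and `stratified_split` are hypothetical helper names, and the (text, spans) layout is assumed as well.

```python
import random
from collections import defaultdict


def label_key(example):
    """Stratum for one (text, spans) pair: its sorted distinct entity labels."""
    _, spans = example
    return tuple(sorted({label for _, _, label in spans}))


def stratified_split(examples, dev_frac=0.2, key=label_key, seed=0):
    """Split so each stratum contributes roughly dev_frac of its items to dev."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for ex in examples:
        strata[key(ex)].append(ex)

    train, dev = [], []
    for group in strata.values():
        rng.shuffle(group)
        n_dev = round(len(group) * dev_frac)
        dev.extend(group[:n_dev])
        train.extend(group[n_dev:])
    return train, dev
```

With many label combinations, some strata will hold only one or two examples and end up entirely in train, so you may want a coarser key (for example, the rarest label in each example).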