TextCat Training Error on Custom Preprocessed Dataset #12357
Hi, I'm new to machine learning and have been working with a dataset that I annotated using Prodigy. I trained a model with Prodigy's CLI training command and everything ran smoothly. However, I recently preprocessed the dataset with some additional steps that altered the data. Saving the preprocessed data to the Prodigy database worked without issues, but training fails with errors when I run the following command: `python -m prodigy train ./training/spancat/test --spancat test --eval-split 0.25` The error message I received was:
I've attached links to the annotated and preprocessed dataset samples for reference. I'm hoping to get some advice on how to resolve these errors and improve the performance of my model with the preprocessed data. Any insights or guidance would be greatly appreciated. Thanks in advance for your help!
Annotated data: https://gist.github.com/daffahilmyf/77cbd546f28070ca27048a7f0d88d1ed
Replies: 1 comment 2 replies
We've seen this error in the past when there was a bug related to docs without any suggestions, but this should be fixed in spaCy v3.3.2 and v3.4.4. Can you double-check which version of spaCy you are using (`spacy info`)?
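If it helps, here is a rough sketch of the same version check done from Python instead of the CLI. The helper names (`parse_release`, `has_empty_doc_fix`, `installed_spacy_version`) are made up for this example and are not part of spaCy or Prodigy. It also assumes that every release from v3.5.0 onward carries the fix, which the reply above does not state explicitly.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Return (major, minor, patch) from a version string such as '3.4.4'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.groups())


def has_empty_doc_fix(version_string):
    """Hypothetical helper: True if this spaCy version should include the fix.

    Based on the reply above: fixed in v3.3.2 (3.3 line) and v3.4.4 (3.4 line).
    """
    release = parse_release(version_string)
    if release[:2] == (3, 3):
        return release >= (3, 3, 2)
    if release[:2] == (3, 4):
        return release >= (3, 4, 4)
    # Assumption: release lines after 3.4 also include the fix.
    return release >= (3, 5, 0)


def installed_spacy_version():
    """Return the installed spaCy version string, or None if not installed."""
    try:
        return version("spacy")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_spacy_version()
    if found is None:
        print("spaCy is not installed in this environment.")
    elif has_empty_doc_fix(found):
        print(f"spaCy {found} should already include the fix.")
    else:
        print(f"spaCy {found} predates the fix; consider upgrading.")
```

Run it in the same virtual environment that Prodigy uses, since a different environment may have a different spaCy version installed.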