Spacy "spancat" not training (possible config mistake) #10562
Hi, Disclaimer: Situation: Issue: Data:
where signature would be the span from doc containing the signature and message is the span containing the rest of the message. I think there's some redundancy in how I've set it up above, but I have attempted this in a number of ways...
EDIT: I didn't make an Entity Ruler and then save the entities to doc.spans. Is this a necessary step? Are my spans perhaps missing something and that's why they're not being used in the training?
Config:
Output:
After which it just returns me to the command line.
spaCy Info:
Afterword:
Replies: 1 comment 12 replies
Hello 😄 Thanks for the detailed description!
At first glance, I don't see anything wrong with the config, so my first guess is that either something is wrong with the data or you're running out of memory.
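To get a rough feel for the memory guess, here is a back-of-the-envelope sketch in plain Python. It assumes the config uses an n-gram style suggester that proposes every n-gram of each configured size as a candidate span; the document length, the sizes, and the batch of 1000 documents are made-up numbers for illustration, not values taken from the config in question, and the helper names are hypothetical.

```python
from typing import Sequence


def count_ngram_candidates(doc_len: int, sizes: Sequence[int]) -> int:
    """Count candidate spans for one doc if every n-gram of each size is proposed."""
    # A doc with doc_len tokens contains (doc_len - n + 1) n-grams of size n,
    # or none at all if the doc is shorter than n.
    return sum(max(0, doc_len - n + 1) for n in sizes)


def candidates_per_batch(doc_lens: Sequence[int], sizes: Sequence[int]) -> int:
    """Total candidate spans across all docs processed together."""
    return sum(count_ngram_candidates(doc_len, sizes) for doc_len in doc_lens)


# Hypothetical numbers: 200-token messages, sizes 1 through 10,
# and 1000 docs processed at once.
sizes = list(range(1, 11))
per_doc = count_ngram_candidates(200, sizes)
per_batch = candidates_per_batch([200] * 1000, sizes)
print(per_doc, per_batch)
```

Each candidate span has to be represented and scored, so the total grows with both the number of sizes and the number of documents handled at once. This is only an estimate of the candidate count, not a measurement of actual memory use.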
If it fails/hangs silently, one thing you could do to debug this further is to run the training repeatedly and kill it with Ctrl+C. The traceback will show what it was doing. If you do that a few times and it's always in the same place, you can be confident that's where it's spending its time.
I also saw that your `batch_size` in `[nlp]` is set to 1000. Have you tried lowering that number? And to spare you the many sizes in the suggester, you can also use this: