Custom trained NER tags differently for the same entity #10101
Hello. I have trained a custom NER model using spaCy on a small dataset to detect custom entities. I have 80 training data files and 20 test data files. It is a basic model without any hyperparameter tuning for now, until I get the complete pipeline in place. At test time, in one test example where the entity to be detected appears twice, it is tagged correctly once and incorrectly the second time. The word is tagged as
which is the correct one and in another instance in the sentence when the word appears, it is tagged as
Will this be fixed with more training data? How much data would be sufficient to obtain a decent model with good recognition?
Replies: 1 comment 1 reply
Hello,
In terms of how much data you need to get good results, there's no definite answer: it depends heavily on the type of data. However, there are guidelines that set some rough thresholds. We've made a flow chart for Prodigy annotations, and its rules can also be applied to other ML tasks.
I think that 80 training examples is a bit low, and increasing your dataset would very likely result in better prediction accuracy.
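As a quick sanity check before annotating more, it can help to count how many annotated spans each label actually has, since 80 files may contain far fewer examples of a given entity type. Below is a minimal sketch in plain Python. It assumes your annotations are available as `(text, {"entities": [(start, end, label), ...]})` tuples; the helper name `count_entity_labels` and the toy `TRAIN_DATA` are hypothetical and only there for illustration.

```python
from collections import Counter


def count_entity_labels(train_data):
    """Count annotated spans per label.

    Assumes each item is (text, {"entities": [(start, end, label), ...]}).
    """
    counts = Counter()
    for _text, annotations in train_data:
        for _start, _end, label in annotations.get("entities", []):
            counts[label] += 1
    return counts


# Hypothetical toy data, only to show the expected input shape.
TRAIN_DATA = [
    ("Acme Corp hired Jane Doe.", {"entities": [(0, 9, "ORG"), (16, 24, "PERSON")]}),
    ("Jane Doe left Acme Corp.", {"entities": [(0, 8, "PERSON"), (14, 23, "ORG")]}),
    ("No entities here.", {"entities": []}),
]

if __name__ == "__main__":
    # Print labels from most to least frequent.
    for label, n in count_entity_labels(TRAIN_DATA).most_common():
        print(f"{label}: {n}")
```

If a label turns out to have only a handful of spans, that label is probably where additional annotation will pay off first.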