The model is not learning #10724
-
I have a question. I understand that the more text a model is trained on, the better it can learn the labels. Take es_core_news_lg as a practical example: it cannot possibly have seen every PERSON entity, and yet when you give it a new text it can still identify one. For example, in "Maria is beautiful", Maria is identified as PERSON. Now, I have a dataset with which I train my own model (holding out part of the data so the model doesn't just overfit). When I test on the held-out set, the model is unable to identify the labels I trained it with; it is not learning, and I don't know why this happens. If you can help me understand this a bit, since I'm very new to the subject, thank you!
-
NER models broadly rely on two kinds of features: token features and context features.

Token features are details of the labeled tokens themselves. This can be the whole token, like how "John" is likely to be a name and "the" isn't, or details of its form: capitalized words are more likely to be names, and words ending in "-son" are likely to be names. Context features are drawn from the surrounding words. So a word after "Mr" or "Miss" is likely to be a name, a word after "the" is not very likely to be one, "my name is" is a big hint, and so on.

Exactly which features are considered varies by model architecture and configuration; how much weight each feature has, and how those weights interact, is what the model learns. For more details on this I recommend the NER chapter (presently chapter 8) in the Jurafsky and Martin book. (Note that while the book is usually very accessible, that's a pretty dense chapter, so I would recommend skimming it for the parts you're interested in.)

If your model isn't working on other data, that data may be too different from your training data. If the text you're testing on has different tokens and different contexts from your training data, it may be something the model has never seen before, and it may be unable to make a prediction, in which case it defaults to predicting nothing. If you give more detail about the data issues you're having we may be able to help, but note that domain adaptation is just a hard problem in general, and usually the answer is simply that you need more training data.
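To make the two feature types concrete, here's a toy sketch in Python. This is not spaCy's actual feature set or implementation, just an illustration of the kinds of signals a statistical NER model might draw on for each word:

```python
# Toy illustration of token features vs. context features for NER.
# NOT spaCy's real feature extraction; a minimal sketch of the idea.

def extract_features(tokens, i):
    """Return illustrative features for tokens[i]."""
    token = tokens[i]
    return {
        # Token features: properties of the word itself
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "suffix": token[-3:],
        # Context features: properties of the neighboring words
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
    }

tokens = "My name is Maria".split()
print(extract_features(tokens, 3))
```

The model learns weights over features like these. If your test text shares almost none of these features with your training data (different vocabulary, different surrounding phrases), the model has very little evidence to base a prediction on, which is one way domain mismatch shows up.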