There is no difference between 1 example and 150 examples when training on a pre-trained model. #8556
-
I have asked a lot of questions about this, but I still haven't solved the problem. After training on a pre-trained model with just 1 example, one of the dependencies in the tested sentence changed from "ccomp" to "dep", which was not what I expected. So I provided more than 150 examples, but the result is still the same. It seems the problem is not related to the number of examples. If that number of examples is not enough, why is just 1 example able to affect such a large pre-trained model at all? Could the "catastrophic forgetting" problem be causing this? Thank you very much!
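For reference, here is roughly what my update loop looks like (the sentence and the head/label annotations below are simplified placeholders, not my real training data):

```python
import random

import spacy
from spacy.training import Example

nlp = spacy.load("en_core_web_sm")

# Placeholder example: one head index and one dependency label per token.
TRAIN_DATA = [
    (
        "I know the place where she lives.",
        {
            "heads": [1, 1, 3, 1, 6, 6, 3, 1],
            "deps": ["nsubj", "ROOT", "det", "dobj",
                     "advmod", "nsubj", "relcl", "punct"],
        },
    ),
]

# The parser in en_core_web_sm listens to the shared tok2vec component,
# so both stay enabled; everything else is left untouched.
with nlp.select_pipes(enable=["tok2vec", "parser"]):
    optimizer = nlp.resume_training()
    for _ in range(20):
        random.shuffle(TRAIN_DATA)
        for text, annotations in TRAIN_DATA:
            example = Example.from_dict(nlp.make_doc(text), annotations)
            nlp.update([example], sgd=optimizer)
```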
-
To keep things connected, here are your other issues on this topic: #8295, #8434, #8536, #8075, #7781, #7942.

As you note, you have asked this repeatedly, and I'm sorry, but I don't think we can give you more help than we already have. To train the parser you need a lot of examples, and there is no specific number that is "enough". The models are also not completely predictable, so while we can say that more data should be better, we can't explain why something happens with 1 example but not with 150.

If you want results and you aren't getting them, you should probably try getting more training data. You could also try engaging an experienced consultant. As you've found, it's difficult for us to give advice about specific models on the discussion forum, and I think you've reached the limit of what we can do here. Also keep in mind that we're doing our best to help everyone with their questions, and it's really not helpful to open so many different threads about the same topic.
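If you do keep experimenting on your own, one technique that can help with this kind of drift is "pseudo-rehearsal": annotate some raw text with the original model and mix those examples in with your new ones, so the parser keeps getting a signal for what it already knew. A minimal sketch (the raw texts and the empty `new_examples` list are placeholders for your own data):

```python
import random

import spacy
from spacy.training import Example

nlp = spacy.load("en_core_web_sm")

# Annotate raw text with the *original* model's predictions and use
# those docs as extra "revision" examples during the update.
raw_texts = [
    "The company reported strong earnings last quarter.",
    "She asked where the meeting would be held.",
]
revision_data = [Example(nlp.make_doc(doc.text), doc) for doc in nlp.pipe(raw_texts)]

new_examples = []  # your hand-labelled Example objects go here

# The parser in en_core_web_sm listens to the shared tok2vec, so keep
# both enabled while updating.
with nlp.select_pipes(enable=["tok2vec", "parser"]):
    optimizer = nlp.resume_training()
    for _ in range(20):
        batch = revision_data + new_examples
        random.shuffle(batch)
        nlp.update(batch, sgd=optimizer)
```

spaCy also has an experimental `Language.rehearse` method aimed at the same problem, but none of this changes the basic point: with this little data the behaviour will stay hard to predict.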
-
There are two sentences that are almost identical in structure. The only difference between them is that in sentence 1, 'where' is an adverb, while in sentence 2, 'who' is a pronoun. However, analyzing the two sentences with the pre-trained model 'en_core_web_sm' didn't produce the dependencies I expected, so I tried to correct them by resume-training 'en_core_web_sm'. For sentence 1, I provided just 1 training example, and the result was exactly what I expected. For sentence 2, I provided a lot of examples, but nothing changed. Even as I keep adding more (there are now 161), the result stays the same, and it seems that adding still more probably won't help. Why are the two situations so different? Why are so many examples not helpful? After all, the model 'en_core_web_sm' is very small.
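For what it's worth, this is how I compare the parses before and after training (the two sentences here are simplified stand-ins for my actual pair):

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Two near-identical sentences: a relative adverb vs. a relative pronoun.
for text in [
    "This is the town where I was born.",  # 'where' is an adverb
    "This is the man who helped me.",      # 'who' is a pronoun
]:
    doc = nlp(text)
    for token in doc:
        print(f"{token.text:8} {token.dep_:10} <- {token.head.text}")
    print()
```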