Adding a new label to a trained Spancategorizer model #10995
-
Hello! Is there a way to introduce a new label when training the spancategorizer while using, as the base, a trained spancat model that did not originally include this label? Background: I want to train a spancategorizer on one label first, then use that trained model as the base for a second model. This second time my training dataset has two labels: the original one and a new one. Is that possible? If yes, how? If I just introduce a new training set with two labels, spaCy complains about the new label. I have even tried initialising both the first and the second model with both labels using a labels.json file, by including the "[initialize.components.spancat.labels]" option in the config file, but that didn't do the trick. I suspect what I am trying to do is not possible at the moment. Is that right?
-
What error do you get when you try this? There shouldn't be a problem here if you're training the model from scratch on this dataset.
-
Thanks for the detail, that's helpful. The main issue is that only some components are resizable, and SpanCategorizer is not one of them. You can check this with a component's is_resizable attribute.

I don't think that means you're out of options for dealing with this problem. If you could tell me a bit more about your data, maybe we can come up with something that'll work. Between your two training sets, are they the same examples in both, just with different labels? Or does training set 1 contain examples that do not appear in training set 2?
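
For reference, here is a minimal sketch of the resizability check, assuming spaCy v3 is installed. The blank English pipeline and the default "spancat" settings are only illustrative assumptions; with your own trained pipeline you would load it with spacy.load and fetch the component with nlp.get_pipe("spancat") instead.

```python
import spacy

# Blank English pipeline; no trained weights are needed for this check.
nlp = spacy.blank("en")

# Add a span categorizer with spaCy's default settings (an assumption for this sketch).
spancat = nlp.add_pipe("spancat")

# A falsy value means the output layer cannot grow to fit a label added after training.
print("spancat resizable:", bool(spancat.is_resizable))
```

If the value is falsy, the label set has to be fixed before training starts, which is why training from scratch on data that contains both labels avoids the error.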