Benefits of Static vectors #12444
-
I have been experimenting with different configurations to migrate v2 models to v3. With all of the new options available through the config file, there has been quite a bit of experimentation. One question I have is about static vectors. With or without them, we are using a tok2vec with multihash embedding. How do the static vectors help? I've gotten the sense from the documentation that tok2vec with static vectors can be useful for transfer learning. If I've decided that transfer learning is impractical given my data and use case, is there still any benefit to using them? I know my models are taking significantly more memory to deploy. Am I in turn benefiting from improved accuracy? My config file is copied below. Thanks. `[paths] [system] [nlp] [components] [components.ner] [components.ner.model] [components.ner.model.tok2vec] [components.tok2vec] [components.tok2vec.model] [components.tok2vec.model.embed] [components.tok2vec.model.encode] [corpora] [corpora.dev] [corpora.train] [training] [training.batcher] [training.batcher.size] [training.logger] [training.optimizer] [training.score_weights] [pretraining] [initialize] [initialize.before_init] [initialize.components] [initialize.tokenizer]`
Replies: 1 comment
Hey python3Berg, it's largely an empirical question whether you will benefit from static vectors. Your domain might be so specific that transfer learning does not help much, or your text might contain many unusual tokens for which the pretrained vectors are uninformative. For named entity recognition, we published a technical report in which static vectors were always helpful on the datasets we experimented with, especially when recognizing unseen entities, i.e. entities not present in the training set: https://arxiv.org/abs/2212.09255
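
Since the question is empirical, one way to settle it is to train the same pipeline twice, once with and once without static vectors, and compare the dev scores and the size of the packaged model. The fragment below is a minimal sketch of the two settings that typically control this in a v3 config. It assumes the embedding layer is `spacy.MultiHashEmbed.v2`, and the vectors package name `en_core_web_lg` is only a placeholder; the values in the original config were not included in the post, so every other key is left out here.

```ini
# Variant A: static vectors enabled.
# Only the relevant keys are shown; keep the rest of each block as in your own config.
[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
# Concatenate the pretrained vectors with the hashed embeddings.
include_static_vectors = true

[initialize]
# Placeholder: any installed package or path that provides a vectors table.
vectors = "en_core_web_lg"

# Variant B: static vectors disabled.
# Set include_static_vectors = false in the embed block and
# vectors = null in [initialize], then retrain with the same data and seed.
```

If variant B scores about the same on your dev set, dropping the vectors table is likely to recover most of the extra memory, because the hashed embeddings alone are much smaller than a full vectors table.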