Include Lemmatizer into a (trained) Transformer model #6324
Replies: 11 comments
-
If you're starting with a new config, just add [...]

If you have a trained model, you can install [...]

nlp.add_pipe("lemmatizer").initialize()
nlp.to_disk("/path/to/model")

If you don't want the default lemmatizer config (for Swedish it may be the (untested?) rule-based lemmatizer, and maybe you want the lookup lemmatizer instead, for example), you can provide options; see https://nightly.spacy.io/api/lemmatizer

Edited to add [...]
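A minimal sketch of the trained-model route described above, under stated assumptions: a blank Swedish pipeline stands in for the trained model, the two-entry `lemma_lookup` table is a hypothetical toy table supplied in code, and the output path is a temporary directory. Passing `config={"mode": "lookup"}` selects the lookup lemmatizer rather than the language default.

```python
import tempfile

import spacy
from spacy.lookups import Lookups

# Assumption: a blank Swedish pipeline stands in for the trained model.
nlp = spacy.blank("sv")

# Hypothetical toy table; real lookup data would have many more entries.
lookups = Lookups()
lookups.add_table("lemma_lookup", {"bilarna": "bil", "husen": "hus"})

# Add the lemmatizer in lookup mode and initialize it with the table.
lemmatizer = nlp.add_pipe("lemmatizer", config={"mode": "lookup"})
lemmatizer.initialize(lookups=lookups)

# Save the pipeline and load it again to confirm the lemmatizer is included.
with tempfile.TemporaryDirectory() as model_dir:
    nlp.to_disk(model_dir)
    reloaded = spacy.load(model_dir)

lemmas = [token.lemma_ for token in reloaded("bilarna husen")]
print(reloaded.pipe_names, lemmas)
```

Calling `initialize()` with no arguments, as in the reply above, lets the component load its own default tables instead of the ones passed in here.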
-
That worked out of the box, thanks a lot!
Is there some way to include my own rules or lookup mappings, other than those already in [...]?
-
You can use any function that returns the data in the right format (as a [...]).
(Embedded code reference: lines 92 to 96 in dc816bb)
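As a rough illustration of such a function, here is a sketch that assumes the "right format" is a `spacy.lookups.Lookups` object holding the table the chosen lemmatizer mode expects (`lemma_lookup` for lookup mode). The registry name `custom_lemma_lookups.v1`, the function `load_custom_lookups`, the JSON file name and its contents are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

import spacy
from spacy.lookups import Lookups


# Hypothetical registry name and function; the Lookups return value is the point.
@spacy.registry.misc("custom_lemma_lookups.v1")
def load_custom_lookups(path: str) -> Lookups:
    """Read a JSON file mapping word forms to lemmas into a Lookups object."""
    table = json.loads(Path(path).read_text(encoding="utf8"))
    lookups = Lookups()
    lookups.add_table("lemma_lookup", table)
    return lookups


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical data file containing a toy mapping.
    json_path = Path(tmp_dir) / "lemma_lookup.json"
    json_path.write_text(json.dumps({"bilarna": "bil"}), encoding="utf8")

    # Look the function up by its registered name and call it.
    loader = spacy.registry.misc.get("custom_lemma_lookups.v1")
    custom_lookups = loader(str(json_path))

# Hand the result to a lookup-mode lemmatizer on a blank Swedish pipeline.
nlp = spacy.blank("sv")
lemmatizer = nlp.add_pipe("lemmatizer", config={"mode": "lookup"})
lemmatizer.initialize(lookups=custom_lookups)
custom_lemmas = [token.lemma_ for token in nlp("bilarna")]
print(custom_lemmas)
```

For training, the module that defines the registered function has to be importable by the training process, for example through the `--code` option of `spacy train`.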
-
@adrianeboyd Thank you for your help!
-
I tried to write my own registered function together with [...]
I'm training with the command: [...]
My [...]
Where I used the JSON files from [...]
And I'm getting an accuracy of [...]
Where I would expect an error, I'm getting exactly the same accuracy of [...]
When replacing the JSON file string by an empty string: [...]
Then I'm getting an error message: [...]
So I suppose the lookup function is being called with [...]
What is wrong with my custom [...]?
Thanks!
-
If you're loading the lemmatizer with [...]
If the lemmatizer is in your config, the initialize step is run when the model is initialized before training. You need to change the [...]
-
@adrianeboyd Thanks for your answer!
No, that is not the problem. This example is just to show that it doesn't make any difference what kind of JSON I load in my custom loader function. So my assumption is that when [...]
While when I just tested with an empty JSON file and I'm still getting [...]
So the problem is that, no matter what the return value of my custom loader function is, it will only consider some default data, probably provided by [...]
-
Ah, I see what might be happening. It's not [...]
-
I tried with: [...]
But this resulted in an error: [...]
I tried as well with: [...]
This went through without error but yielded the same result as before. Is there something more I can try?
-
You want something like this. It's just the lookups that are loaded by the registered function:

[initialize.components]

[initialize.components.lemmatizer]

[initialize.components.lemmatizer.lookups]
@misc = "spacy.LookupsDataLoader.v1"
lang = "sv"
tables = ["lemma_lookup"]

Make sure this is working from [...]
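One way to check which data the lemmatizer component actually ended up with is to inspect its tables after initialization or loading. The sketch below assumes a blank Swedish pipeline and a hypothetical one-entry table in place of the real training output; in practice `nlp` would come from `spacy.load` on the trained pipeline directory.

```python
import spacy
from spacy.lookups import Lookups

# Assumption: stand-in pipeline with a hypothetical toy table.
nlp = spacy.blank("sv")
lookups = Lookups()
lookups.add_table("lemma_lookup", {"bilarna": "bil"})
lemmatizer = nlp.add_pipe("lemmatizer", config={"mode": "lookup"})
lemmatizer.initialize(lookups=lookups)

# Tables held by the lemmatizer component itself.
component = nlp.get_pipe("lemmatizer")
component_tables = component.lookups.tables

# Tables held by the shared vocab, which are stored separately.
vocab_tables = nlp.vocab.lookups.tables

# Number of entries in the table the lookup lemmatizer reads from.
table_size = len(component.lookups.get_table("lemma_lookup"))
print(component_tables, vocab_tables, table_size)
```

If the component's table names or sizes do not match the data the loader was supposed to return, the loader is probably not the one wired into `initialize.components.lemmatizer.lookups`.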
-
@adrianeboyd Yes, now it works!
-
Hi,
I'm struggling to include a lemmatizer in my Swedish transformer model (new spaCy 3.0 nightly). I'm training the model via:
python -m spacy train config.cfg
The config file I'm using is basically the default transformer config created via:
python -m spacy init config -p tagger,morphologizer,parser -l sv config.cfg
I removed the line lookups = null in initialize and added an initialize.lookups. So the last part of the config basically looks like this (rest is default):
[...]
During the training I get the message:
Added vocab lookups: lemma_rules
But the final model does not contain a lemmatizer and the lemma attribute still remains empty.
Are there any examples of how to include a lemmatizer in a transformer model? I looked through all the documentation segments I found about this, but it still remains unclear to me how to include the lemmatizer in the model/package.
A practical example how to include the lemmatizer would be great!
Thanks in advance!