The lemmatizer also includes a table of exceptions that has precedence over the rules:

print(lemmatizer.lookups.get_table("lemma_exc")["noun"]["læret"])
# ['lære']

You can customize any of the tables in lemmatizer.lookups.tables to change the lemmatizer behavior. If you save the model with nlp.to_disk(), your changes will be preserved.

lemmatizer.lookups.get_table("lemma_exc")["noun"]["læret"] = ["whatever"]

Be aware that there's a lemmatizer cache, so you might not see the changes until you save and reload the model, or manually wipe out the cache:

lemmatizer.cache = {}
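
To make the cache issue concrete, here's a toy sketch in plain Python. It is not spaCy's actual implementation: `ToyLemmatizer`, its rule format, and the "et" suffix rule are all made up for illustration. It only mimics the behavior described above: exceptions are checked before rules, and results are cached per word and part of speech.

```python
class ToyLemmatizer:
    """Hypothetical stand-in for a rule-based lemmatizer (not spaCy code)."""

    def __init__(self, exceptions, rules):
        self.exceptions = exceptions  # {pos: {word: [lemmas]}}
        self.rules = rules            # {pos: [(suffix, replacement), ...]}
        self.cache = {}               # {(word, pos): [lemmas]}

    def lemmatize(self, word, pos):
        key = (word, pos)
        # A cached result is returned without consulting the tables again.
        if key in self.cache:
            return self.cache[key]
        exc = self.exceptions.get(pos, {})
        if word in exc:
            # The exception table takes precedence over the suffix rules.
            lemmas = list(exc[word])
        else:
            lemmas = [word]
            for suffix, repl in self.rules.get(pos, []):
                if word.endswith(suffix):
                    lemmas = [word[: len(word) - len(suffix)] + repl]
                    break
        self.cache[key] = lemmas
        return lemmas


toy = ToyLemmatizer(
    exceptions={"noun": {"læret": ["lære"]}},
    rules={"noun": [("et", "")]},  # made-up rule: strip a trailing "et"
)

first = toy.lemmatize("læret", "noun")   # exception wins over the "et" rule
toy.exceptions["noun"]["læret"] = ["whatever"]
stale = toy.lemmatize("læret", "noun")   # still answered from the cache
toy.cache = {}                           # wipe the cache
fresh = toy.lemmatize("læret", "noun")   # the table edit is now visible
print(first, stale, fresh)
```

In this toy version, the second call still returns the old lemma because the cached entry is hit before the edited table is consulted. That's the same effect you'd be working around by resetting the cache after editing a table.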

If you want these changes in a new model you're training from scratch, you'd want to have a custom install of spacy-…

Answer selected by joakimwar
Labels
feat / lemmatizer Feature: Rule-based and lookup lemmatization