The output is probably different because the tagger is producing a different fine-grained tag (token.tag_) for this word. That tag is mapped to a coarse-grained part of speech (token.pos_), and the lemma rules are then chosen based on that POS. The tag->pos mapping and the lemmatizer algorithm are nearly identical in the v2.3.x and v3.0.x model versions. The tag->pos mapping was updated for the v3.2.x model versions.
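
To check whether the tag is where the outputs diverge, it can help to print the tag, POS and lemma side by side for each token. The following is a minimal sketch, not part of spaCy itself: `token_rows` is a hypothetical helper name, and it only assumes that each token exposes the `text`, `tag_`, `pos_` and `lemma_` attributes.

```python
from typing import Iterable, List, Tuple


def token_rows(doc: Iterable) -> List[Tuple[str, str, str, str]]:
    """Return (text, tag, pos, lemma) for every token in a processed doc.

    Works with any iterable of objects that expose the attributes
    text, tag_, pos_ and lemma_, such as a spaCy Doc.
    """
    return [(t.text, t.tag_, t.pos_, t.lemma_) for t in doc]


# Usage with spaCy (assumes spaCy and the en_core_web_sm package are installed):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   for row in token_rows(nlp("your example sentence")):
#       print(*row, sep="\t")
```

Running this in each environment on the same sentence should show whether the word already gets a different `tag_`, in which case the different lemma follows from the tag rather than from the lemmatizer.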

Each exact model version (v2.3.0, v2.3.1, v3.0.0) may produce slightly different output for the same example because of differences in model config or training. Although the tagger algorithm stayed the same, the details of the config and training settings changed quite a bit from v2 to v3.
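
Because two model versions usually live in separate environments, one way to compare them is to dump per-token records to JSON in each environment and diff the files afterwards. This is a rough sketch under assumptions: the record layout (`text`, `tag`, `pos`, `lemma` keys) and the helper names `load_rows` and `diff_rows` are made up for illustration, and it assumes both versions tokenize the text identically.

```python
import json
from typing import Dict, List

# Fields compared between the two runs (hypothetical record layout).
FIELDS = ("tag", "pos", "lemma")


def load_rows(path: str) -> List[Dict[str, str]]:
    """Load a JSON list of per-token records written in one environment."""
    with open(path, encoding="utf8") as f:
        return json.load(f)


def diff_rows(rows_a: List[Dict[str, str]], rows_b: List[Dict[str, str]]) -> List[dict]:
    """Return one entry per token whose tag, pos or lemma differs between runs."""
    # Tokens are aligned by position, so the token texts must match exactly.
    if [r["text"] for r in rows_a] != [r["text"] for r in rows_b]:
        raise ValueError("tokenization differs; rows cannot be aligned one-to-one")
    diffs = []
    for a, b in zip(rows_a, rows_b):
        changed = {f: (a[f], b[f]) for f in FIELDS if a[f] != b[f]}
        if changed:
            diffs.append({"text": a["text"], "changed": changed})
    return diffs
```

If a token shows up with a changed `tag` as well as a changed `lemma`, the difference most likely comes from the tagger rather than from the lemmatizer rules.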

In addition, examples with more…

Answer selected by adrianeboyd
Labels: lang / en (English language data and models), feat / lemmatizer (Feature: Rule-based and lookup lemmatization), v2 (spaCy v2.x)