Are you possibly using the v3.0.0 model rather than the v3.1.0 model? You can check the version of en_core_web_trf in the output of pip freeze and/or run spacy validate. If so, I think you've just run into an example that doesn't work well for this model, which is primarily trained on newspaper-style text with standard capitalization. See more details about the statistical models in #3052.
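
The same version check can also be done from Python instead of reading the pip freeze output by eye. This is a minimal sketch, assuming Python 3.8+ and that the model was installed as a regular pip package; the helper names installed_version and version_tuple are made up for illustration and are not part of spaCy:

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(version_string):
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints.

    Returns None if the string does not start with three numeric parts.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    return tuple(int(group) for group in match.groups()) if match else None


if __name__ == "__main__":
    # Report what is installed for the library and for the model package.
    for name in ("spacy", "en_core_web_trf"):
        print(name, installed_version(name) or "not installed")

    # Flag a model older than v3.1.0, the release mentioned in this thread.
    model = installed_version("en_core_web_trf")
    parsed = version_tuple(model) if model else None
    if parsed is not None and parsed < (3, 1, 0):
        print("en_core_web_trf is older than v3.1.0; consider upgrading the model")
```

This only reports what is installed; spacy validate remains the way to check whether the installed models are compatible with the installed spaCy version.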

The en_core_web_trf v3.1.0 model was trained with some lowercase augmentation, which should improve performance on texts without newspaper-style capitalization. With spaCy v3.1.1 and en_core_web_trf v3.1.0, both versions of "Richard" are shown as PERSON for me:

import spacy
from spacy import displacy

nlp = spacy.load("en_core_w…

Answer selected by svlandeg
Labels: lang / en (English language data and models), models (Issues related to the statistical models), perf / accuracy (Performance: accuracy)
This discussion was converted from issue #8978 on August 17, 2021 16:16.