
This is a bit of a quirk, but it's currently the intended behavior for spaCy v2 and v3. The Token objects in a doc have a backoff behavior for vectors: if doc.tensor is set, the context-sensitive tensors are provided as the vectors. doc.tensor is set by a tok2vec component in the pipeline.
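For example, a minimal sketch (assuming en_core_web_sm is installed; the sm models ship without static vectors, so any vector reported here comes from the tensor backoff):

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("word")
# the tok2vec component has filled doc.tensor
assert doc.tensor.size > 0
# has_vector backs off to the context-sensitive tensor
assert doc[0].has_vector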

If you only apply the tokenizer or use a blank model instead of en_core_web_sm, you can see what it looks like when doc.tensor is not set:

assert nlp.make_doc("text")[0].has_vector is False
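The same holds for a blank pipeline, which has neither static vectors nor a tok2vec component (a small sketch):

import spacy

nlp = spacy.blank("en")
# no tok2vec component, so doc.tensor stays empty and no static vectors exist
assert nlp("text")[0].has_vector is False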

If you want to know whether there is a static word vector in nlp.vocab.vectors, you can use token.is_oov or you can check the lexeme rather than the token:

# for an existing Token
nlp("word")[0].lex.has_vector
# look up th…
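A fuller sketch of those checks (again assuming en_core_web_sm is installed; since the sm model has no static vectors, both lookups report that none is stored):

import spacy

nlp = spacy.load("en_core_web_sm")

# for an existing Token: check the underlying lexeme, which ignores doc.tensor
token = nlp("word")[0]
print(token.lex.has_vector)   # False: no static vector in nlp.vocab.vectors
print(token.is_oov)           # True: in v3, is_oov means "no static vector"

# or look the lexeme up directly in the vocab, without creating a Doc
print(nlp.vocab["word"].has_vector)   # False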
