If the en_core_web_sm model doesn't have word vectors, how can it do other things?
#10721
Hi, I am confused about this, and I think other people are also confused about whether word2vec is in the model and how it is used.
Replies: 1 comment 3 replies
Hi @Oscarjia,

The `en_core_web_sm` model does have a tok2vec component. The `sm` tok2vec is based on token features like NORM, PREFIX, SUFFIX, etc., whereas the `md`/`lg` tok2vec has those features plus an external static word vector concatenated into it. To further clarify:

- `tok2vec` points to some vector for a token, not the same vectors as "static word vectors."
- In all of the models (`sm`/`md`/`lg`), the `tok2vec` component produces context-sensitive tensors that are stored in `Doc.tensor`.

So to answer your question, the `sm` model does have a tok2vec component based on token features. That's why it can also do those downstream tasks (POS, etc.) and why it has an option in the config file.
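
To make the distinction concrete, here is a minimal sketch. It is an assumption-laden stand-in rather than the actual `en_core_web_sm` pipeline: it builds a blank English pipeline, adds a `tok2vec` component with spaCy's default settings, and initializes it with random (untrained) weights so that no model download is needed. The example sentence and variable names are made up for illustration. It shows that a pipeline with an empty static vectors table can still fill `Doc.tensor`.

```python
import spacy

# Blank English pipeline: no trained weights and no static word vectors.
nlp = spacy.blank("en")

# Add a tok2vec component with its default config. By default it embeds
# token features (NORM, PREFIX, SUFFIX, SHAPE), not static vectors.
nlp.add_pipe("tok2vec")

# Initialize with random weights (untrained, for illustration only).
nlp.initialize()

doc = nlp("The small model has no static word vectors.")

# Number of rows in the static vectors table (empty for this pipeline).
n_static_vectors = nlp.vocab.vectors.shape[0]

# Doc.tensor holds one row per token, produced by the tok2vec component.
tensor_shape = doc.tensor.shape

print("pipeline:", nlp.pipe_names)
print("static vector rows:", n_static_vectors)
print("Doc.tensor shape:", tensor_shape)
```

With the trained package installed, the same checks should apply after replacing the first three steps with `nlp = spacy.load("en_core_web_sm")`.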