Get all the tokens' vector without loop #12667
I have a very large document and need the vector for every word in it. The only way I have found is vectors = [token.vector for sent in nlp(doc).sents for token in sent], which takes a long time. Is there a faster way to do this?
Replies: 1 comment 3 replies
We'd be able to help better if you gave us some more information about what exactly you are trying to do with this. The contextualised
You should not be doing this - the vectors that this pipeline generates are trained to return representations that are a) highly conditioned on their context, and b) specific to some downstream task; it's not going to be very meaningful to run similarity tests on them.
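For what it's worth, the caveat above is about contextual representations. If the pipeline instead uses static word vectors (an assumption; the thread does not say which model is loaded), every occurrence of a word maps to the same vector, so one option is to look up each distinct word once and expand the result with NumPy indexing. Below is a minimal sketch of that idea, using a blank pipeline and made-up 3-dimensional toy vectors in place of a real model; the sample text and vector values are hypothetical.

```python
import numpy as np
import spacy

# Blank English pipeline: tokenizer only, no trained components.
nlp = spacy.blank("en")

# Hypothetical 3-dimensional toy vectors, purely for illustration.
nlp.vocab.set_vector("apple", np.array([1.0, 0.0, 0.0], dtype="float32"))
nlp.vocab.set_vector("banana", np.array([0.0, 1.0, 0.0], dtype="float32"))

doc = nlp("apple banana apple cherry")

# One call returns the ORTH hash of every token as a 1-D array.
orths = doc.to_array("ORTH")

# Look up each distinct word once; `inverse` maps every token
# position back to its row in the per-word table.
unique_orths, inverse = np.unique(orths, return_inverse=True)
unique_vectors = np.stack(
    [nlp.vocab.get_vector(int(orth)) for orth in unique_orths]
)

# Fancy indexing expands the per-word table to one row per token.
matrix = unique_vectors[inverse.reshape(-1)]
print(matrix.shape)
```

There is still a Python loop here, but it runs once per distinct word rather than once per token, which may matter for a large document with a lot of repetition. Words without a vector ("cherry" in this toy example) come back as all-zero rows.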