Some of these settings have similar names because the underlying concepts are similar, but the word2vec settings for the static word vectors are completely separate from the tok2vec settings. For tok2vec, see: https://spacy.io/api/architectures/#tok2vec-arch

word2vec has a large number of hyperparameters and most of them influence each other, so it's hard to give simple advice. Our main recommendation is to evaluate on your downstream task. (There are some similarity-related measures that can be used for intrinsic evaluation of word vectors, but they often don't correlate well with downstream performance on other types of tasks.)
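To make the "similarity-related measures" concrete, here is a minimal sketch of one common kind of intrinsic evaluation: rank word pairs by the cosine similarity of their vectors and compare that ranking with human similarity ratings using Spearman correlation. Everything in it is an assumption for illustration: the 2-dimensional `vectors`, the word pairs, and the human ratings are made-up toy values, not output from word2vec or from any real benchmark.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranks(values):
    """1-based ranks of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y)

# Hypothetical toy vectors standing in for trained static word vectors.
vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.2, 0.8],
}

# Hypothetical (word, word, human similarity rating) triples.
rated_pairs = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 1.0),
    ("dog", "truck", 2.0),
]

model_scores = [cosine(vectors[a], vectors[b]) for a, b, _ in rated_pairs]
human_scores = [rating for _, _, rating in rated_pairs]
rho = spearman(model_scores, human_scores)
print(f"Spearman correlation with human ratings: {rho:.3f}")
```

A score like this only tells you how well the vectors match similarity judgments. It can be a quick sanity check when comparing word2vec settings, but the comparison that matters is still the score of your downstream component trained with each set of vectors.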

Answer selected by svlandeg
Labels
feat / vectors Feature: Word vectors and similarity feat / tok2vec Feature: Token-to-vector layer and pretraining