I came away with the impression that when defining a training pipeline, the components listed in nlp.pipeline should really only be the components one intends on training. I think this may have been a wrong conclusion on my part.

It is basically correct that the components in nlp.pipeline should only be the ones you are interested in training. However, there is a wrinkle to this.

When you train a statistical model, it needs a source of features. In spaCy pipelines that's going to be a tok2vec or Transformer component (with one exception, see next line). It's usually better to train the feature source together with the model that uses it. So, in your case, it would make sense to train a tok…
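To make the "train the feature source with it" point concrete, here is a rough sketch of what that could look like in a training config. It is not taken from the thread: it assumes the component being trained is a textcat (going by the thread's labels) that listens to a shared tok2vec, and the config excerpt is hypothetical and incomplete (the `width = 96` value in particular is just a placeholder). The Python around it only uses the standard library to read the excerpt and list which components would be updated, on the understanding that anything named in `frozen_components` is left untouched during training.

```python
import configparser
import json

# Hypothetical, incomplete excerpt of a spaCy v3 training config.
# Assumption: a textcat that takes its features from a shared tok2vec.
CONFIG_EXCERPT = """
[nlp]
lang = "en"
pipeline = ["tok2vec","textcat"]

[components.tok2vec]
factory = "tok2vec"

[components.textcat]
factory = "textcat"

[components.textcat.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = 96
upstream = "*"

[training]
frozen_components = []
"""

# Disable interpolation so "${...}" style values would be read literally.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_EXCERPT)

# Values in spaCy configs are JSON-like, so json.loads can decode the lists.
pipeline = json.loads(parser["nlp"]["pipeline"])
frozen = json.loads(parser["training"]["frozen_components"])

# Components in the pipeline that are not frozen get updated in training.
trained = [name for name in pipeline if name not in frozen]

# Components whose model contains a Tok2VecListener rely on the shared tok2vec.
listeners = [
    section.split(".")[1]
    for section in parser.sections()
    if parser[section].get("@architectures", "").startswith('"spacy.Tok2VecListener')
]

print("trained:", trained)
print("listening to shared tok2vec:", listeners)
```

In this sketch both tok2vec and textcat are in `nlp.pipeline` and neither is frozen, so the feature source is trained alongside the component that consumes it.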

Answer selected by kudryk
Labels: training (Training and updating models), feat / textcat (Feature: Text Classifier)