We don't have a guide or example for this yet, so it will be a little involved, but you might want to look at implementing a custom tok2vec layer. You can set your extra data as custom extension (underscore) attributes on the Doc, then pass the Doc through the pipeline (the nlp object). Your custom tok2vec layer reads those attributes and combines them with the default spaCy embeddings in whatever way suits your data; concatenating the two would be one option.
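
To make the first half of that concrete, here is a minimal sketch rather than an official recipe. It assumes the extra data is one fixed-width float row per token. The attribute name `extra_features`, the layer name, and the `ExtraFeatures` helper are all made up for illustration. It only shows a Thinc layer that reads the attribute from each Doc and returns one array per Doc; it does not show how to combine that with spaCy's own embeddings.

```python
from typing import Callable, List, Tuple

import numpy
import spacy
from spacy.tokens import Doc
from thinc.api import Model
from thinc.types import Floats2d

# Hypothetical attribute name; assumed to hold an array of shape (n_tokens, width).
if not Doc.has_extension("extra_features"):
    Doc.set_extension("extra_features", default=None)


def forward(
    model: Model, docs: List[Doc], is_train: bool
) -> Tuple[List[Floats2d], Callable]:
    width = model.get_dim("nO")
    outputs = []
    for doc in docs:
        feats = doc._.extra_features
        if feats is None:
            # Fall back to zeros when no extra data was attached to this Doc.
            feats = numpy.zeros((len(doc), width), dtype="f")
        outputs.append(model.ops.asarray2f(feats))

    def backprop(d_outputs: List[Floats2d]) -> List[Doc]:
        # The features are fixed inputs, so nothing is propagated back to the Docs.
        return []

    return outputs, backprop


def ExtraFeatures(width: int) -> Model[List[Doc], List[Floats2d]]:
    """Illustrative layer: maps each Doc to its stored per-token feature rows."""
    return Model("extra_features", forward, dims={"nO": width})


# Usage: attach the extra data to a Doc, then run the layer on it.
nlp = spacy.blank("en")
doc = nlp("Custom features per token")
doc._.extra_features = numpy.ones((len(doc), 3), dtype="f")

layer = ExtraFeatures(width=3)
vectors = layer.predict([doc])  # one (n_tokens, 3) array per Doc
```

From here, one option might be to join this layer's output with a standard tok2vec layer using Thinc's `concatenate` combinator, and to register the combined model with `@spacy.registry.architectures` so a training config can refer to it. I have not tested that wiring here, so treat it as a starting point.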

Answer selected by MichaelRinger
Labels
feat / tok2vec Feature: Token-to-vector layer and pretraining