It looks like you're on the right track, but for backprop you don't put the gradient in the Docs: the gradient is just the return value of your backprop function. It might be helpful to look at this Thinc tutorial if you haven't seen it.
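
A minimal sketch of that pattern, assuming a toy elementwise layer (the layer name and shapes are illustrative, not from this discussion): the forward function returns its output together with a backprop callback, and the callback returns the gradient with respect to the layer's input.

```python
from thinc.api import Model
from thinc.types import Floats2d


def forward(model: Model, X: Floats2d, is_train: bool):
    # Toy op: double the input.
    Y = X * 2

    def backprop(dY: Floats2d) -> Floats2d:
        # The gradient is returned from the callback,
        # not stored on the input.
        return dY * 2

    return Y, backprop


double: Model[Floats2d, Floats2d] = Model("double", forward)
```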

If you have a function that actually takes Docs as input, there is no gradient to return, because Docs are not learned model parameters, they're just input data. The gradient you have would be relative to the tok2vec output, but if you're freezing your Transformer (for feature extraction) then you can just return an empty gradient. If you actually want to be able to update the Transformer, then you can return a gradient of the same type and shape as the input to forward.
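
For the Docs case, a hedged sketch of what "return an empty gradient" can look like (the layer name and output width here are hypothetical):

```python
from typing import List

from spacy.tokens import Doc
from thinc.api import Model
from thinc.types import Floats2d


def docs_forward(model: Model, docs: List[Doc], is_train: bool):
    # Hypothetical feature extraction: one 10-dim row per Doc.
    Y = model.ops.alloc2f(len(docs), 10)

    def backprop(dY: Floats2d) -> List[Doc]:
        # Docs are input data, not learned parameters, so there is
        # nothing to update upstream: an empty gradient is fine.
        return []

    return Y, backprop


extract: Model[List[Doc], Floats2d] = Model("extract_features", docs_forward)
```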
