Document-Level Embeddings in Transformer Model #11715
Unanswered

sunnyifan asked this question in Help: Model Advice
While using a Transformer-based model like `en_core_web_trf`, two tensors are exposed from `trf_data`:

- `num_docs * num_tokens * hidden_size` (per-token embeddings);
- `num_docs * hidden_size` (per-document embeddings).

How should we interpret the per-document embeddings? From the model structure of RoBERTa, it seems likely that the per-document embeddings are the last-layer embedding of `[CLS]` passed through a linear layer and then a tanh. Was this final linear-tanh layer trained for a specific task?
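For concreteness, a minimal sketch of how the two tensors can be inspected. This assumes a spacy-transformers version in which `doc._.trf_data` exposes a `tensors` list (the data structures have changed across releases), so treat it as illustrative rather than authoritative:

```python
import spacy

# Sketch: inspect the two tensors described above.
# Assumes en_core_web_trf is installed and that doc._.trf_data
# exposes a .tensors list (the TransformerData API).
nlp = spacy.load("en_core_web_trf")
doc = nlp("spaCy exposes the transformer output on each Doc.")

tensors = doc._.trf_data.tensors
per_token = tensors[0]   # (num_spans, num_wordpieces, hidden_size)
pooled = tensors[-1]     # (num_spans, hidden_size)
print(per_token.shape, pooled.shape)
```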
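On the linear-plus-tanh hypothesis: the pooled output in Hugging Face's RoBERTa implementation does come from a pooler head of exactly that shape. A minimal PyTorch sketch mirroring that head (an illustration of the mechanism, not spaCy's own code):

```python
import torch
import torch.nn as nn

class PoolerSketch(nn.Module):
    """HF-style pooler: Linear -> tanh over the last-layer hidden
    state of the first token ([CLS], i.e. <s> in RoBERTa)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        first_token = hidden_states[:, 0]
        return self.activation(self.dense(first_token))
```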
Replies: 1 comment
Just double-checking: you're aware of how longer texts are split into overlapping strided spans with the span getter (https://spacy.io/api/transformer#span_getters)?
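For context, that splitting is configured on the transformer component's span getter. A sketch of the relevant config block, using the documented `spacy-transformers.strided_spans.v1` getter; the `window` and `stride` values here are illustrative, not necessarily those of your pipeline:

```ini
[components.transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96
```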