doc.tensor not assigned with a transformer. #11632
I was under the impression from this article that a transformer would assign doc.tensor. However, for both my own trained model and scispaCy's en_core_sci_scibert, it is an empty array. After some digging I found discussions and issues saying that you need to create doc.tensor yourself from the transformer output.

My confusion is that there should already be an implementation of this in spacy-transformers, because spaCy's NER model works with token vectors (spaCy tokenization) and not with vectors for the transformer's wordpieces (transformer tokenization). And looking at this line of code from the relation extraction tutorial, it seems that the transformer listener returns an array whose length matches the number of tokens in a doc, and not an instance of TransformerData, which is what I get when I access the transformer output directly.

So is the behaviour of the transformer listener different between training and calls to nlp(), in a way that makes the listener return vectors aligned to spaCy's tokenization? Or is there a different way to access spaCy-aligned transformer vectors at runtime?
Replies: 1 comment 1 reply
Transformer data is transformed into token-aligned tensors here. I'm not sure whether that fully answers your question; if not, please keep asking!
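To make the idea of "token-aligned tensors" concrete, here is a minimal sketch of pooling wordpiece vectors into one vector per spaCy token. It is not the spacy-transformers implementation: the function name `pool_wordpieces`, the toy data, and the choice of mean pooling are all assumptions for illustration, and the alignment is represented as plain lists (wordpiece counts per token plus a flat list of wordpiece row indices) rather than any library type.

```python
import numpy as np


def pool_wordpieces(wordpiece_vectors, align_lengths, align_indices):
    """Mean-pool wordpiece vectors into one vector per token.

    wordpiece_vectors: array of shape (n_wordpieces, width)
    align_lengths:     number of wordpieces aligned to each token
    align_indices:     wordpiece row indices, concatenated token by token
    (hypothetical helper, not part of spaCy or spacy-transformers)
    """
    width = wordpiece_vectors.shape[1]
    token_vectors = np.zeros(
        (len(align_lengths), width), dtype=wordpiece_vectors.dtype
    )
    start = 0
    for i, length in enumerate(align_lengths):
        if length > 0:
            rows = align_indices[start:start + length]
            token_vectors[i] = wordpiece_vectors[rows].mean(axis=0)
        # Tokens with no aligned wordpieces keep a zero vector.
        start += length
    return token_vectors


# Toy example: 5 wordpieces ([CLS], "trans", "##former", "model", [SEP])
# aligned to 3 tokens ("transformer", "model", and a token with no wordpieces).
wordpieces = np.array(
    [[0, 0], [1, 1], [3, 3], [4, 4], [9, 9]], dtype="float32"
)
lengths = [2, 1, 0]
indices = [1, 2, 3]

token_vectors = pool_wordpieces(wordpieces, lengths, indices)
print(token_vectors.shape)  # (3, 2): one row per token
```

In a real pipeline the inputs would come from the transformer output rather than hand-written arrays (in spacy-transformers 1.x I believe these live on `doc._.trf_data` as `tensors` and `align`, but treat those names as an assumption and check them against your installed version).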