Finetuning transformer into TextCat #9599
-
I'm having a lot of trouble finetuning/pretraining a tok2vec or transformer layer in a pipeline with a text categorizer. I have a few variations on the pipeline configured and I've encountered different errors in different places. I hope to work through a few issues in this thread. (I'll post the first and edit more in as I finish writing them up, or if/when appropriate.)

TypeError: 'FullTransformerBatch' object is not iterable

When pretraining a transformer, I get an error about transformer batches not being iterable. I assume this indicates something wrong with my configuration, but I've seen the error associated with known bugs, so I wonder if it's a spacy/transformers issue. Either way, I have no idea what this error is about. Does anyone see what is wrong below, or know where else to look for mistakes? (A rough sketch of the shape of config I mean follows the collapsed sections.)

Traceback
Command and config excerpts
Info about spaCy
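For context while those excerpts stay collapsed, here is a rough, partial sketch of the shape of config I mean: a transformer feeding a textcat through a listener, with the [pretraining] section pointed at the transformer component. The model name, architectures, and values below are illustrative defaults, not my actual settings.

```ini
[nlp]
lang = "en"
pipeline = ["transformer","textcat"]

[components.transformer]
factory = "transformer"

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "roberta-base"
tokenizer_config = {"use_fast": true}

[components.textcat]
factory = "textcat"

[components.textcat.model]
@architectures = "spacy.TextCatEnsemble.v2"

[components.textcat.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0

[components.textcat.model.tok2vec.pooling]
@layers = "reduce_mean.v1"

[components.textcat.model.linear_model]
@architectures = "spacy.TextCatBOW.v2"
exclusive_classes = true
ngram_size = 1
no_output_layer = false

# Pointing pretraining at the transformer component is where the
# FullTransformerBatch error shows up for me.
[pretraining]
component = "transformer"
layer = ""
corpus = "corpora.pretrain"

[pretraining.objective]
@architectures = "spacy.PretrainCharacters.v1"
maxout_pieces = 3
hidden_size = 300
n_characters = 4
```

The command is the usual `python -m spacy pretrain config.cfg ./pretrain_output --paths.raw_text raw.jsonl` (paths here are placeholders).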
-
The answer is that spaCy doesn't support pretraining/finetuning transformers right now, isn't it?

"The impact of spacy pretrain varies, but it will usually be worth trying if you’re not using a transformer model"

Darn. I suppose I'll re-pretrain my CNN tok2vec component and start a different thread (with an appropriate title) for the errors I had in that vein...

If anyone has thoughts on why this isn't supported, I'd be interested to hear them. My domain is highly specific and full of jargon, which I think makes it worth finetuning a language model even with a transformer, but I'm open to reasons I'm wrong about that.
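In case it's useful to anyone landing here with the same plan, this is roughly the [pretraining] block I'm falling back to for the CNN tok2vec. It's a sketch based on the defaults that `python -m spacy init fill-config base.cfg config.cfg --pretraining` fills in, with placeholder paths, so treat the values as starting points rather than anything tuned for my data:

```ini
[paths]
raw_text = "assets/raw_text.jsonl"

[pretraining]
max_epochs = 1000
dropout = 0.2
# Pretrain the CNN tok2vec component rather than a transformer.
component = "tok2vec"
layer = ""
corpus = "corpora.pretrain"

[pretraining.objective]
@architectures = "spacy.PretrainCharacters.v1"
maxout_pieces = 3
hidden_size = 300
n_characters = 4

[corpora.pretrain]
@readers = "spacy.JsonlCorpus.v1"
path = ${paths.raw_text}
min_length = 5
max_length = 500
```

Then `python -m spacy pretrain config.cfg ./pretrain_output` runs the character-objective pretraining, and the resulting weights file from the output directory gets passed into training via `init_tok2vec` in the `[initialize]` block.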