Finetuning transformer into TextCat #9599
I'm having a lot of trouble finetuning/pretraining a tok2vec or transformer layer in a pipeline with a text categorizer. I have a few variations on the pipeline configured, and I've encountered different errors in different places. I hope to work through a few issues in this thread. (I'll post the first and edit more in as I finish writing them up, or if/when appropriate.)

TypeError: 'FullTransformerBatch' object is not iterable

When pretraining a transformer, I get an error about transformer batches not being iterable. I assume this indicates something wrong with my configuration, but I've seen the error associated with known bugs, so I wonder if it's a spacy/transformers issue. Either way, I have no idea what this error is about. Does anyone see what is wrong below, or know where else to look for mistakes?

Traceback (collapsed)
Command and config excerpts (collapsed)
Info about spaCy (collapsed)
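To give a sense of the setup without expanding the collapsed excerpts: the pipeline is the standard transformer-plus-listener arrangement for textcat, with the [pretraining] block pointed at the transformer component. The sketch below is not my exact config, just a minimal reconstruction of that kind of setup using the stock registered architectures from spacy-transformers and spaCy; the model name and other values are placeholders.

```ini
# Minimal sketch (not my actual excerpts); run with something like:
#   python -m spacy pretrain config.cfg ./pretrain_output

[nlp]
lang = "en"
pipeline = ["transformer","textcat"]

[components.transformer]
factory = "transformer"

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "roberta-base"

[components.transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96

[components.textcat]
factory = "textcat"

[components.textcat.model]
@architectures = "spacy.TextCatEnsemble.v2"
nO = null

[components.textcat.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
upstream = "*"

[components.textcat.model.tok2vec.pooling]
@layers = "reduce_mean.v1"

[components.textcat.model.linear_model]
@architectures = "spacy.TextCatBOW.v2"
exclusive_classes = true
ngram_size = 1
no_output_layer = false

[pretraining]
# This is the part in question: pretraining targets the transformer component.
# Remaining [pretraining] defaults were filled in with spacy init fill-config --pretraining.
component = "transformer"
layer = ""
```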
The answer is that spaCy doesn't support pretraining/finetuning transformers right now, isn't it?
"The impact of spacy pretrain varies, but it will usually be worth trying if you’re not using a transformer model"
Darn. I suppose I'll re-pretrain my CNN tok2vec component and start a different thread (with an appropriate title) for the errors I ran into in that vein...
If anyone has thoughts on why this isn't supported, I would be interested to hear them -- my domain is highly specific and full of jargon, which I think makes it worthwhile to finetune a language model even with a transformer, but I would listen to reasons I'm wrong on that.
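For anyone who lands here later, this is roughly the fallback workflow I have in mind for pretraining the CNN tok2vec and feeding it into the textcat training run. It's a sketch, not a tested recipe: all paths, the raw-text corpus, and the checkpoint filename are placeholders, and it assumes a config generated with the stock --pretraining defaults.

```bash
# Generate a CPU textcat config (CNN tok2vec via --optimize accuracy) with a
# [pretraining] block filled in. Check that the component/layer settings in
# [pretraining] point at the tok2vec you actually want to pretrain.
python -m spacy init config config.cfg --lang en --pipeline textcat --optimize accuracy --pretraining

# Pretrain the tok2vec layer on raw domain text (JSONL with a "text" field per line).
python -m spacy pretrain config.cfg ./pretrain_output --paths.raw_text ./raw_text.jsonl

# Train the textcat pipeline, initializing the tok2vec from one of the pretrained
# checkpoints (the filename depends on how many epochs actually ran).
python -m spacy train config.cfg --output ./training \
    --paths.train ./train.spacy --paths.dev ./dev.spacy \
    --paths.init_tok2vec ./pretrain_output/model999.bin
```

In the stock generated config, paths.init_tok2vec feeds the [initialize] block, so the tok2vec starts from the domain-pretrained weights rather than a random initialization.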