Help on Training SpanCat or NER model using Transformer #9817
I am trying to apply the SpanCategorizer and/or NER architecture (https://github.com/explosion/projects/tree/v3/experimental/ner_spancat) to an adverbial disambiguation task in English (adverbs of manner, temporal adverbs, etc.). I was able to successfully replicate the original SpanCategorizer project in Indonesian. I was also able to swap in my own dataset and language (that is, the English adverbial disambiguation task) and train a preliminary model using the Tok2Vec architecture. However, I am now stuck on swapping the tok2vec component for a transformer. Below, I describe the overall plan and the specific problems I have. I would appreciate it if you could take a look at the .cfg file below and check whether it contains any errors.

Project plan and progress

Goal: apply the ner or SpanCategorizer architecture to an English adverbial disambiguation task.
Issue

When I train the model using transformer models, I get 0s for all evaluation metrics. I think this may mean that no training or prediction is happening (possibly due to errors in the .cfg file). I would like help figuring out the source of this issue and potential solutions.
spaCy versions
Config file for transformer

The following config file may contain errors, which result in the aforementioned 0s in the eval metrics. I would appreciate any suggestions on how to make this work!
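Since my full config is long, here is a minimal sketch of the parts I changed when swapping tok2vec for a transformer. The component wiring follows the spaCy project template; the model name (`roberta-base`) and hyperparameters below are illustrative placeholders, not necessarily my exact values:

```ini
[components.transformer]
factory = "transformer"

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "roberta-base"

# spancat no longer gets its own tok2vec; it listens to the
# shared transformer component instead
[components.spancat.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0

[components.spancat.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
```

Note that `transformer` also has to appear before `spancat` in the `[nlp]` pipeline list so the listener receives its outputs.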
Config file for tok2vec (for comparison)

This config worked fine. The performance was not ideal, but it successfully trained the model.
Result of the tok2vec model (to show that training worked with tok2vec)
It looks like your GPU isn't configured correctly. It's technically possible to train a transformer on CPU, but it will take an extremely long time, and is not recommended. Are you training on CPU intentionally? I wouldn't expect it to train without error but produce meaningless results like this, but I wanted to check.
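To rule that out, there are two places where GPU use is configured: the `[system]` block of the config, and the `--gpu-id` flag at train time. A fragment illustrating the usual setup (adjust paths and the GPU id to your environment):

```ini
[system]
# use PyTorch's memory allocator when training transformers on GPU
gpu_allocator = "pytorch"

# then train with, e.g.:
#   python -m spacy train config.cfg --gpu-id 0 --output ./output
```

If `--gpu-id` is omitted or set to -1, training runs on CPU regardless of the config.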