Improving Text Classifier Accuracy #9996
We are working on a text classification problem using the spaCy text categorizer spacy.TextCatEnsemble.v2 (https://spacy.io/api/architectures#textcat). The model was run using the spaCy English model en_core_web_sm. The architecture is a stacked ensemble of a linear bag-of-words model and a neural network model; the neural network is built on a Tok2Vec layer and uses attention. Our goal is to improve the accuracy of the classifier, and we believe that replacing spacy.TextCatBOW.v2 with a modern word embedding technique is the way to go, but we cannot find a way to do that. We found the documentation on swapping model architectures (https://spacy.io/usage/layers-architectures#swap-architectures), but it does not say anything about replacing TextCatBOW with a more modern technique. Please advise.
Replies: 1 comment
You can just rewrite your config to use a different architecture in the textcat component (updating other parameters as necessary, or leaving them blank and using fill-config) and then retrain your model. You can use TextCatCNN, for example.

The docs there describe the process of swapping architectures exactly; what about them is not what you're looking for? The example changes an ensemble to BOW, but you can just change your BOW to something else.
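
For illustration, here is a rough sketch of what the textcat section of a config.cfg might look like after switching to TextCatCNN. This is not the asker's actual config. The component name `textcat`, the choice of `exclusive_classes = true`, and all tok2vec hyperparameters are assumptions based on the example values in the spaCy architecture docs, so adjust them to your data and check the architectures page for the spaCy version you have installed.

```ini
# Hypothetical excerpt of config.cfg; only the textcat blocks are shown.
[components.textcat]
factory = "textcat"

[components.textcat.model]
# Replaces spacy.TextCatEnsemble.v2 (or spacy.TextCatBOW.v2).
@architectures = "spacy.TextCatCNN.v2"
# Assumption: each text has exactly one label. Use false for multilabel.
exclusive_classes = true
nO = null

[components.textcat.model.tok2vec]
# Assumed embedding layer and sizes, copied from the docs' example values.
@architectures = "spacy.HashEmbedCNN.v2"
pretrained_vectors = null
width = 96
depth = 4
embed_size = 2000
window_size = 1
maxout_pieces = 3
subword_features = true
```

If you leave some settings out, something like `python -m spacy init fill-config base_config.cfg config.cfg` should fill in the defaults, and `python -m spacy train config.cfg --output ./output` retrains the model (the file and directory names here are placeholders).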