making a trained model inherit custom tokenizer #8498
Unanswered
DSLituiev asked this question in Help: Other Questions
Replies: 1 comment · 6 replies
-
Can you provide a bit more background? How did you implement the custom tokenizer? Did you register a custom one, and do you see it in the config file of the model that is saved to disk? I'd be happy to look into this further, but it would be good to have some code to be able to replicate the issue: the custom tokenizer, an example config file, and an example snippet of the input text you're feeding in.
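For reference, this is a minimal sketch of what the relevant block in the saved model's config.cfg would look like if a custom tokenizer was registered via spaCy's @registry.tokenizers decorator; the name "no_emoticon_tokenizer.v1" is a placeholder, not necessarily what was used here:

```ini
[nlp.tokenizer]
# placeholder registry name; the stock value would be "spacy.Tokenizer.v1"
@tokenizers = "no_emoticon_tokenizer.v1"
```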
-
I am training a model that needs the "8)" emoticon exception disabled, so that "8)" is not kept as a single token.
I train with:
python -m spacy train config-filled.cfg --code ./functions.py --output my-best-model-ever
When I take that model and run prodigy ner.teach, I see that Prodigy again groups "8)" into one token. How can I make sure the custom code is inherited by the checkpoints? Do I have to copy it somewhere manually?
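For illustration, here is a minimal sketch of the kind of functions.py that could be passed via --code to drop the "8)" rule; the registry name and the exact rule tweak are assumptions, not necessarily what the original functions.py contains:

```python
# functions.py -- hypothetical sketch, not the original poster's code
from spacy.util import registry


@registry.tokenizers("no_emoticon_tokenizer.v1")
def create_no_emoticon_tokenizer():
    # Reuse spaCy's built-in tokenizer factory, then drop the "8)" special
    # case so the tokenizer no longer keeps it as a single emoticon token.
    default_factory = registry.tokenizers.get("spacy.Tokenizer.v1")()

    def make_tokenizer(nlp):
        tokenizer = default_factory(nlp)
        rules = dict(tokenizer.rules or {})
        rules.pop("8)", None)
        tokenizer.rules = rules
        return tokenizer

    return make_tokenizer
```

With a setup like this, the saved pipeline's config.cfg only stores the registry name, so whichever process loads the model later (including prodigy ner.teach) needs to be able to import the registering code again; spacy package --code functions.py is one way to bundle the file with the model so it is imported automatically on load.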