Fine-tuning PaddleOCR with Custom Numbers-Only Dictionary #15178
I'm working on a project where I only need to recognize numerical digits (0–9). I was wondering if it's possible to fine-tune an existing PaddleOCR model using a custom dictionary that includes only numbers, in order to reduce the number of possible output classes and improve accuracy/speed.
Any insights or references would be greatly appreciated!
Hey,
Yes, it is possible to fine-tune an existing model with a numbers-only character dictionary. The recommended approach is to restrict the dictionary referenced in your config file to the characters you want to recognize, then fine-tune the model (PP-OCRv3 or PP-OCRv4) on your custom data.
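To make the dictionary step concrete, here is a minimal sketch. It assumes the usual PaddleOCR recognition conventions: a dictionary file with one character per line, and a label file with one `image_path<TAB>label` entry per line. The file name `digits_dict.txt` and both helper functions are made up for illustration and are not part of PaddleOCR.

```python
from pathlib import Path

DIGITS = "0123456789"


def write_digit_dict(path):
    """Write a dictionary file with one character per line (assumed PaddleOCR format)."""
    Path(path).write_text("\n".join(DIGITS) + "\n", encoding="utf-8")
    return list(DIGITS)


def find_bad_labels(label_lines, allowed=DIGITS):
    """Return (line_number, label) pairs whose label is empty or uses characters outside `allowed`."""
    bad = []
    for line_number, line in enumerate(label_lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        # Assumed label format: "<image path>\t<label>"
        _, _, label = line.partition("\t")
        if not label or any(ch not in allowed for ch in label):
            bad.append((line_number, label))
    return bad


if __name__ == "__main__":
    write_digit_dict("digits_dict.txt")  # hypothetical output file name
    sample = ["imgs/0001.jpg\t12345", "imgs/0002.jpg\t12a45"]
    print(find_bad_labels(sample))
```

Checking the labels before training is worthwhile because any label character missing from the dictionary cannot be learned, so it is easier to catch those samples up front.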
This doc should help: Finetune.md
You can also get pre-trained models from the Model List. This model supports both English and number recognition. In your case, you can take an English pre-trained model and fine-tune it on numerical digits (0-9).
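As a rough sketch of how the fine-tuning run could be launched, the helper below assembles a `tools/train.py` command with `-o` overrides. It assumes the config keys `Global.pretrained_model`, `Global.character_dict_path` and `Global.use_space_char` as they appear in the PaddleOCR recognition configs I have seen. The config path, checkpoint path, and the function `build_finetune_command` itself are placeholders, so check them against your PaddleOCR checkout and the fine-tuning doc.

```python
import shlex


def build_finetune_command(config, pretrained, dict_path, use_space_char=False):
    """Assemble the argument list for a PaddleOCR recognition fine-tuning run."""
    overrides = [
        f"Global.pretrained_model={pretrained}",
        f"Global.character_dict_path={dict_path}",
        # Digits-only output has no spaces, so the space character is disabled here.
        f"Global.use_space_char={use_space_char}",
    ]
    return ["python3", "tools/train.py", "-c", config, "-o", *overrides]


if __name__ == "__main__":
    cmd = build_finetune_command(
        config="configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml",  # placeholder path
        pretrained="./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy",  # placeholder path
        dict_path="digits_dict.txt",  # hypothetical digits-only dictionary
    )
    # Print a shell-ready command line; run it from the PaddleOCR repo root.
    print(shlex.join(cmd))
```

The training and evaluation label paths still need to point at your own data, either in the YAML file or through further `-o` overrides.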