Fine-tune v3 detection model on handwriting images #14868
Replies: 1 comment 1 reply
To fine-tune the ch_PP-OCRv3_det model for handwriting detection, you need to adjust hyperparameters to better suit your dataset of grayscale handwriting images. Based on your dataset's characteristics (small grayscale images with a consistent height), the following modifications are recommended:

1. Adjust Image Processing and Augmentation

Since your images are grayscale, ensure that the model correctly handles single-channel images:
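A minimal sketch of the relevant transforms in the `Train.dataset.transforms` section, assuming the standard PP-OCRv3 detection config layout (exact values in your `ch_PP-OCRv3_det_cml.yml` may differ). Since the pretrained backbone expects 3-channel input, it is usually safer to keep `img_mode: BGR` (grayscale files are replicated across channels at decode time) than to switch the model to single-channel input:

```yaml
Train:
  dataset:
    transforms:
      - DecodeImage:
          img_mode: BGR        # grayscale images are expanded to 3 channels,
          channel_first: false # matching the pretrained weights' input shape
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: hwc
```

Keeping BGR decoding avoids a channel mismatch with the pretrained weights; if you do convert the model to single-channel input, the first conv layer's weights will not load cleanly.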
2. Optimize Learning Rate and Training Schedule

You have reduced the learning rate from 0.001 to 0.0001, which might be too low initially. A possible alternative:
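One option is an intermediate learning rate with a short warmup, using the cosine schedule the PP-OCRv3 configs already employ. A sketch (the value 0.0005 is a suggested midpoint between the original 0.001 and your 0.0001, not a tuned number):

```yaml
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005  # midpoint between 0.001 and 0.0001 for a small dataset
    warmup_epoch: 2        # brief warmup stabilizes early fine-tuning steps
  regularizer:
    name: L2
    factor: 0.00005
```

With only ~1000 images, a rate that is too low can make the fine-tuned model nearly indistinguishable from the pretrained one, which matches the symptom you describe.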
3. Fix Pre-Trained Model Path

Your current config refers to pretrained_model: ./pretrained_models/ch_PP-OCRv3_det_train/best_accuracy. If you are getting unexpected behavior, double-check the log output to confirm the pretrained weights are correctly loaded.

4. Tweak Loss Function
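The cml config trains with a distillation-style combined loss, but the underlying DB loss knobs are what you would tune for thin handwriting strokes. A hedged sketch of the standard `DBLoss` parameters (in the cml config these appear inside the distillation `CombinedLoss` rather than at the top level):

```yaml
Loss:
  name: DBLoss
  balance_loss: true       # balances positive/negative pixels for sparse strokes
  main_loss_type: DiceLoss
  alpha: 5                 # weight of the shrink-map loss
  beta: 10                 # weight of the threshold-map loss
  ohem_ratio: 3            # hard-negative mining ratio
```

For small, sparse handwriting regions, `balance_loss` and OHEM matter most, since most pixels in each crop are background.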
5. Modify Input Size for Small Text Regions

Your dataset contains small handwriting samples (~78-258 pixels in width). To improve training:
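A sketch of the crop and eval-resize settings, assuming the standard PP-OCRv3 detection transforms (the default training crop is 960×960, which is far larger than your images; the exact keys in your config may differ):

```yaml
Train:
  dataset:
    transforms:
      - EastRandomCropData:
          size: [640, 640]   # smaller crop than the 960x960 default, closer
          max_tries: 50      # to the scale of ~42px-high handwriting lines
          keep_ratio: true
Eval:
  dataset:
    transforms:
      - DetResizeForTest:
          limit_side_len: 736
          limit_type: min    # upscales small images so thin strokes survive
```

Since your images are only ~42 pixels tall, make sure resizing does not shrink them further; with `limit_type: min`, small inputs are scaled up rather than down.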
6. Experiment with Batch Size
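With roughly 1000 images, a moderate batch size gives more update steps per epoch. A sketch of the loader settings (values are starting points to experiment with, not tuned recommendations):

```yaml
Train:
  loader:
    shuffle: true
    batch_size_per_card: 8  # try 8 or 16; smaller batches give more updates
    num_workers: 4          # per epoch on a ~1000-image dataset
```

If you increase the batch size, consider scaling the learning rate up proportionally.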
Next Steps
These optimizations should help the detector adapt better to your handwriting dataset. Let me know if you encounter any issues! 🚀

Response generated by 🤖 feifei-bot | chatgpt-4o-latest
Hi Paddle Community,
I am going to fine-tune a detection model using ch_PP-OCRv3_det_cml.yml as the config and ch_PP-OCRv3_det_slim as the pretrained model. I have built my own dataset, consisting mainly of grayscale handwriting images with widths ranging from 78 to 258 pixels and a consistent height of about 42 pixels. I have balanced the image batch size and made the total dataset about 1000 images.
Which hyperparameters in the config file should I modify to suit my case? I have already fine-tuned once, mainly changing the epochs to 100 and the learning rate from 0.001 to 0.0001, but the inference result is similar to that of the original pretrained model.
Here is the config: