PaddleOCR: fine-tuning a pretrained model with custom data doesn't work as expected #14965
Based on your detailed description and the PaddleOCR configuration and training process you provided, there are several possible reasons why your fine-tuned model produces no meaningful results even though the original pretrained model worked well. Here are the most common problems and the corresponding solutions:
🔗 Related discussion: The discussion in #13897 likewise confirms these points.
📌 Zusammenfassung & Empfehlungen: ✅ Verwende dieselbe Architektur wie das Pretrained Modell (SVTR_LCNet, nicht CRNN) Wenn du diese Punkte umsetzt, solltest du bei einem Retrain mit deinen 1100 Bildern erste Verbesserungen sehen können. Mit mehr Daten (>=5000) wird das Ergebnis natürlich stabiler. Response generated by 🤖 feifei-bot | chatgpt-4o-latest |
Hello,
I am trying to get the best possible text recognition results for my specific kind of images. I researched the PaddleOCR documentation on recognition training and found that I would need 100,000+ images to train my own custom model. I can't do that at the moment, though, because I only have access to 1100 images right now. Then I found out about fine-tuning a pretrained model like "en_PP-OCRv3_rec", which needs 5000+ images for good results, but I figured that with 1100 images I could at least get a first impression of whether it works and whether I see small changes in the resulting recognition. So I prepared all the data in a folder called FineTuningOCR like this:
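Roughly this layout (a reconstructed sketch from the description below; the exact tree may differ slightly):

```text
FineTuningOCR/
├── best_accuracy.pdparams    (pretrained en_PP-OCRv3_rec weights)
├── en_dict.txt               (character dictionary)
└── trainData/
    ├── train.txt             (label file)
    └── images/
        └── image.jpg, ...
```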
best_accuracy.pdparams is the pretrained model. en_dict.txt is the dictionary containing all the symbols. The images are located in the trainData folder, together with train.txt, which follows the scheme "[pathToPicture/image.jpg][TAB][symbols]", for example: "C:/Users/Max/Documents/FineTuning/trainData/images/image.jpg D-FWFG".
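As a sanity check on the label format, a small sketch like this can be run over train.txt (the path is illustrative, not necessarily my exact one):

```python
import os

# Illustrative path -- adjust to the actual location of the label file.
label_file = r"C:/Users/Max/Documents/FineTuning/trainData/train.txt"

with open(label_file, "r", encoding="utf-8") as f:
    for lineno, line in enumerate(f, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        parts = line.split("\t")  # PaddleOCR expects exactly one TAB separator per line
        if len(parts) != 2:
            print(f"line {lineno}: not TAB-separated -> {line!r}")
            continue
        img_path, text = parts
        if not os.path.exists(img_path):
            print(f"line {lineno}: image not found -> {img_path}")
```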
My FineTuning.yml file, which guides the training, looks like this:
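It is based on the official en_PP-OCRv3_rec config; in rough outline it looks like the sketch below (paths are illustrative, and field names should be checked against configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml in the PaddleOCR repo):

```yaml
# Rough sketch of a fine-tuning config derived from en_PP-OCRv3_rec.yml.
# All paths are illustrative; Optimizer, Loss, PostProcess and Eval sections
# follow the official config and are omitted here.
Global:
  use_gpu: true
  epoch_num: 50
  save_model_dir: ./result/model/
  pretrained_model: ./FineTuningOCR/best_accuracy   # weights, given without .pdparams
  character_dict_path: ./FineTuningOCR/en_dict.txt

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet        # must match the pretrained en_PP-OCRv3_rec model

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./FineTuningOCR/trainData/   # image paths in train.txt are joined with data_dir
    label_file_list:
      - ./FineTuningOCR/trainData/train.txt
  loader:
    batch_size_per_card: 16    # small batch size for a 4 GB GPU
```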
My hardware is an Nvidia GTX 1650 GPU with 4 GB VRAM, and I have 32 GB RAM.
I tried to train it and it seemed that everything worked fine.
Training command in cmd:
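It was essentially the standard PaddleOCR training invocation, roughly like this sketch (paths illustrative):

```
python tools/train.py -c FineTuning.yml -o Global.pretrained_model=./FineTuningOCR/best_accuracy
```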
I received a best_model folder in my result/model/ folder, which contained "model.pdopt" and "model.pdparams". This file type couldn't be used directly as the model, so I used the following command to convert it:
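The conversion was essentially PaddleOCR's export script, along these lines (a sketch with illustrative paths):

```
python tools/export_model.py -c FineTuning.yml -o Global.pretrained_model=./result/model/best_model/model Global.save_inference_dir=./result/inference/
```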
And I got these files: inference.pdiparams, inference.pdiparams.info, inference.pdmodel, and inference.yml.
This is the content of inference.yml (I find it suspicious that some characters are not in quotation marks):
Then I tried this code:
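It is essentially the standard PaddleOCR 2.x Python API call, roughly like this sketch (the model paths and the test image are illustrative):

```python
from paddleocr import PaddleOCR

# Illustrative paths -- point rec_model_dir at the exported inference model
# and rec_char_dict_path at the same dictionary used for training.
ocr = PaddleOCR(
    rec_model_dir="./result/inference/",
    rec_char_dict_path="./FineTuningOCR/en_dict.txt",
    use_angle_cls=False,
    lang="en",
)

# Recognition only: det=False skips text detection and runs the recognizer
# directly on the given image crop.
result = ocr.ocr("./testImage.jpg", det=False, rec=True)
print(result)
```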
With the pretrained inference model alone, I get the text shown in the image as the result, while the newly fine-tuned model recognizes nothing. I don't understand why, or what I did wrong. I thought it should perform at least as well as the pretrained model.
I also tried several other images, and for every one of them there was no result. Even the very images I used for training produced no recognition result. So I think there is an issue in my training process.
Thank you in advance for looking into my problem and maybe helping me.