-
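🐛 Bug (Problem description)
We currently use the PPV5 recognition and detection models, with PPStructureV3 for table structure. We are quite satisfied with the V5 recognition results, especially the accuracy on printed text and on handwriting.
Expected:
Questions
2. Should we use distillation or fine-tuning? What is the difference between them for our problem?
3. Is there documentation for a PPV5-based solution? We want to stay on V5 only, because its handwriting accuracy is so good.
4. I tried fine-tuning by following the V5 documentation (the v5rec recognition model). The training data was a corpus of 5,000 Chinese samples with special-character images mixed in at a ratio of 10:1.
Reference documentation:
The yml is shown below (essentially unchanged from the default).
After training, the model's handwriting recognition and its previous text recognition accuracy both dropped drastically (it has essentially forgotten everything).
5. So we are looking for the right way to approach this. Many thanks!
Figure 1
🏃♂️ Environment
paddleocr3.0-gpu
🌰 Minimal Reproducible Example
Global: Optimizer: Architecture: Loss: PostProcess: Metric: Train: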
Replies: 5 comments 1 reply
-
How large is your dataset, and what does it look like?
-
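I prepared about 5,000 images, such as:
The dataset contains no handwriting samples and no punctuation marks. The users' tables, however, contain every kind of content: a mix of Chinese and English, handwritten characters, and special characters.
At the moment only special-character recognition is poor. How should I prepare the dataset? (Roughly how large should it be, and does it need to cover all Chinese characters, digits, letters, punctuation marks, and special characters?)
Is it possible to improve only special-character recognition without affecting V5's current capabilities?
Thanks!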
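One common way to limit forgetting during fine-tuning is to keep replaying general samples (printed and handwritten text) next to the new special-character samples, instead of training on the new data alone. The sketch below only shows how such a mixed training list could be assembled. It assumes tab-separated `image_path<TAB>text` label lines, and the function name, file names, and the 10:1 ratio are illustrative assumptions, not a recipe confirmed for PPV5.

```python
import random


def mix_label_lines(general, special, general_per_special=10, seed=0):
    """Build a shuffled training list that keeps up to `general_per_special`
    general samples for every special-character sample.

    `general` and `special` are lists of label lines ("image_path\\ttext").
    """
    rng = random.Random(seed)
    # Cap the general pool so the requested ratio is not exceeded.
    n_general = min(len(general), len(special) * general_per_special)
    mixed = rng.sample(general, n_general) + list(special)
    rng.shuffle(mixed)
    return mixed


# Hypothetical label lines; real ones would be read from your label files.
general_lines = [f"general/{i}.jpg\ttext {i}" for i in range(100)]
special_lines = [f"special/{i}.jpg\t±{i}" for i in range(5)]

train_lines = mix_label_lines(general_lines, special_lines)
```

Since the dataset described above has no handwriting, the general pool would need handwritten and punctuation samples added for the replay to cover those cases. How much general data is enough is something to measure on a held-out set that includes handwriting.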
-
Maybe you can train a model to detect the specific characters and build a classification model to handle the new characters.
-
First of all, I am trying to understand your question.
Do you mean that you get two results, one from ppocrV5 and another from a newly trained model, and you need to choose the better of the two?
I think you need an AI model for the decision-making. I recommend Gemma or DeepSeek-R1. With these tools, you can add prompts that help choose the best result for your business.
If your current situation doesn't allow you to deploy an AI model, you may need to build a word map or train a decision-making algorithm.
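A minimal sketch of the rule-based alternative, assuming both recognizers are run on the same cropped text line and each returns a text plus a confidence score. `RecResult`, `SPECIAL_CHARS`, `pick_result`, and the 0.9 threshold are hypothetical names and values chosen for illustration; they are not part of PaddleOCR.

```python
from dataclasses import dataclass


@dataclass
class RecResult:
    text: str
    score: float  # recognition confidence, assumed to lie in [0, 1]


# Hypothetical set of characters that only the fine-tuned model was taught.
SPECIAL_CHARS = set("±Ø°Ω≤≥")


def pick_result(base: RecResult, special: RecResult, min_score: float = 0.9) -> RecResult:
    """Keep the base model's result unless the fine-tuned model confidently
    reports at least one special character for the same crop."""
    has_special = any(ch in SPECIAL_CHARS for ch in special.text)
    if has_special and special.score >= min_score:
        return special
    return base


# The fine-tuned model wins only when it sees a special character with high confidence.
chosen = pick_result(RecResult("0 10mm", 0.80), RecResult("Ø10mm", 0.95))
```

With this rule the base model stays in charge of ordinary and handwritten text, so its current behaviour is untouched wherever the fine-tuned model reports no special characters. The threshold would need tuning on real data.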
From: figoshi
Sent: 9 June 2025, 13:02
Subject: Re: [PaddlePaddle/PaddleOCR] Seeking a solution for special characters in industrial settings (Discussion #15568)
How can I assemble and merge the results of the old model (ppocrV5) with the special-character results of the new model?
Thanks
-
#15552