-
How large is the dataset? How large is the dictionary?
-
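As far as I remember, the ppocr technical report says the text recognition model was trained on roughly 17 million samples. Your dataset is on the small side.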
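---- Original message being replied to ----
| From | ***@***.***> |
| Sent | 2024-06-20 11:54 |
| To | PaddlePaddle/PaddleOCR ***@***.***> |
| Cc | SWHL ***@***.***>, Comment ***@***.***> |
| Subject | Re: [PaddlePaddle/PaddleOCR] rec v4 model training: acc does not improve and stays at 0.2 from epoch 5 all the way to epoch 500 (Discussion #13139) |
The dataset has 1,000 annotated images and 10,000 synthetic images. The dictionary is the default ppocr dictionary, with more than 6,000 characters.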
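One way to make the dictionary-size question concrete is to compare the characters that actually occur in the labels with the dictionary in use. The following is a minimal sketch, assuming the label file uses one `image_path<TAB>label` entry per line; the function names, the inline sample lines, and the small `toy_dict` are hypothetical stand-ins, not part of PaddleOCR.

```python
from collections import Counter


def label_charset(label_lines):
    """Count every character that occurs in the labels.

    Assumes each line looks like 'image_path<TAB>label'.
    Lines without a tab are skipped.
    """
    counts = Counter()
    for line in label_lines:
        line = line.rstrip("\r\n")
        if "\t" not in line:
            continue
        _, label = line.split("\t", 1)
        counts.update(label)
    return counts


def compare_with_dict(counts, dict_chars):
    """Report label characters missing from the dictionary and
    how many dictionary entries never occur in the labels."""
    used = set(counts)
    dict_set = set(dict_chars)
    return {
        "missing_from_dict": sorted(used - dict_set),
        "unused_dict_entries": len(dict_set - used),
    }


# Hypothetical label lines built from the two sample strings in this thread.
sample_labels = [
    "img/0001.jpg\tH123_HNC_123KWH_3_WSS82_00237_V1.2.G.3_B_123789",
    "img/0002.jpg\tH546_QSK_456Kwh_1_PQ_NGWS84_23469_V8.7.G.6_B_186238",
]
# Hypothetical tiny dictionary; in practice, read one character per line
# from the dictionary file that the config points to.
toy_dict = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_")

report = compare_with_dict(label_charset(sample_labels), toy_dict)
print(report)
```

If the labels only use a few dozen distinct characters while the dictionary has more than 6,000 entries, a much smaller custom dictionary may be worth trying, although whether that helps here is not confirmed by this thread.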
-
Try checking your dataset.
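A dataset check could start with simple structural tests on the label file. The sketch below is only an illustration under stated assumptions: it assumes one `image_path<TAB>label` entry per line, and the default `max_text_length=25` is a placeholder that should be replaced with the value actually set in your own config. The helper name `check_label_lines` is hypothetical.

```python
def check_label_lines(lines, max_text_length=25):
    """Return (line_number, problem) tuples for a recognition label file.

    max_text_length=25 is an assumed placeholder; use the value from
    your own training config instead.
    """
    problems = []
    seen_paths = set()
    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        parts = line.split("\t")
        if len(parts) != 2:
            problems.append((n, "expected 'image_path<TAB>label'"))
            continue
        path, label = parts
        if not label:
            problems.append((n, "empty label"))
        if len(label) > max_text_length:
            problems.append(
                (n, f"label length {len(label)} exceeds max_text_length {max_text_length}")
            )
        if path in seen_paths:
            problems.append((n, f"duplicate image path: {path}"))
        seen_paths.add(path)
    return problems


# Hypothetical label line using one of the sample strings from this thread.
example = ["img/0001.jpg\tH123_HNC_123KWH_3_WSS82_00237_V1.2.G.3_B_123789"]
for line_no, problem in check_label_lines(example):
    print(line_no, problem)
```

The sample strings in this thread are around 50 characters long, so comparing label lengths against the configured maximum text length seems like a sensible first check. How over-length labels are handled depends on the training pipeline, so treat any finding here as a lead rather than a diagnosis.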
-
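Problem Description
I am training the rec v4 model, and acc and norm_edit_dis do not improve. From epoch 5 all the way to epoch 500, acc stays at around 0.2 and norm_edit_dis at around 0.25, while loss decreases steadily from roughly 1.6 to 1.25.
Epoch 5: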
[2024/06/17 21:25:07] ppocr INFO: epoch: [5/500], global_step: 375, lr: 0.000500, acc: 0.162760, norm_edit_dis: 0.251667, CTCLoss: 0.419245, NRTRLoss: 1.264903, loss: 1.716696, avg_reader_cost: 0.00009 s, avg_batch_cost: 0.30337 s, avg_samples: 83.2, ips: 274.25515 samples/s, eta: 6:24:43, max_mem_reserved: 32275 MB, max_mem_allocated: 28751 MB
Epoch 490:
[2024/06/19 20:18:17] ppocr INFO: epoch: [490/500], global_step: 33810, lr: 0.000500, acc: 0.187500, norm_edit_dis: 0.310671, CTCLoss: 0.053128, NRTRLoss: 1.220689, loss: 1.272771, avg_reader_cost: 0.00020 s, avg_batch_cost: 0.59513 s, avg_samples: 156.8, ips: 263.47019 samples/s, eta: 0:07:08, max_mem_reserved: 29210 MB, max_mem_allocated: 27333 MB
[2024/06/19 20:18:19] ppocr INFO: cur metric, acc: 0.19544740819899994, norm_edit_dis: 0.2862414661770183, fps: 1205.2127416129172
best epoch:
[2024/06/19 20:18:19] ppocr INFO: best metric, acc: 0.20094191365037745, is_float16: False, norm_edit_dis: 0.22495511140089897, fps: 1212.9491793506809, best_epoch: 334
The text material consists of long strings. Examples:
H123_HNC_123KWH_3_WSS82_00237_V1.2.G.3_B_123789
H546_QSK_456Kwh_1_PQ_NGWS84_23469_V8.7.G.6_B_186238
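When I run infer with the trained model, most of the text is recognized correctly, but something is always missing: one character, one ".", or one other symbol.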
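To see how "almost right" predictions relate to the two logged metrics, here is a minimal sketch that computes exact-match accuracy and one minus the mean normalized edit distance. It assumes those are the definitions behind `acc` and `norm_edit_dis`; PaddleOCR's own metric code may preprocess strings differently, so the numbers are illustrative only. Both function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def rec_metrics(pairs):
    """Exact-match accuracy and 1 - mean normalized edit distance.

    pairs is a list of (prediction, target) strings. Each distance is
    normalized by the longer of the two strings (an assumed convention).
    """
    correct = 0
    norm_dist_sum = 0.0
    for pred, target in pairs:
        correct += pred == target
        denom = max(len(pred), len(target), 1)
        norm_dist_sum += edit_distance(pred, target) / denom
    n = max(len(pairs), 1)
    return {"acc": correct / n, "norm_edit_dis": 1 - norm_dist_sum / n}


target = "H123_HNC_123KWH_3_WSS82_00237_V1.2.G.3_B_123789"
pred = target.replace(".", "", 1)  # simulate one dropped "."
print(rec_metrics([(pred, target)]))
```

Under these definitions, a single dropped character makes the whole sample count as wrong for `acc`, which fits a low accuracy on long strings. The same prediction would still score close to 1 on `norm_edit_dis`, so a logged value near 0.3 suggests the evaluation predictions differ from the targets by more than one or two characters. That is only an inference from the definitions assumed here.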
Runtime Environment
config: ch_PP-OCRv4_rec.yml