PaddleOCR/main/version3.x/module_usage/text_recognition #15474
Replies: 41 comments 53 replies
-
How do I use a fine-tuned model after exporting it?
-
How do I export a distillation model? After export, the yml file has no model name under Global, so PaddleOCR 3.0 cannot find a matching model name. The same is true for the teacher model.
-
I added it after exporting, but that raises an error.
Replied message (05/29/2025, 7:56 PM):
Alternatively, you can add it directly to the exported inference.yaml.
-
The error is roughly that the model name cannot be found. I copied the name directly from the configuration files under config, for example en_ppocrv2_cml.
Replied message (05/29/2025, 8:06 PM):
What exactly is the error?
-
I used this configuration file for distillation training: ch_PP-OCRv3_det_cml.yml
Training was done in a Paddle 2.4 environment. I then found that I could not export the model, so I upgraded to the latest Paddle version. After exporting, there was no model name under Global, so I added ch_PP-OCRv3_det_cml as the name, but the error still says the model name cannot be found.
On 2025-05-29 20:06:47, "liuhongen1234567" wrote:
What exactly is the error?
-
There is one more issue. I trained a PP-OCRv3 English rec model with an older PaddlePaddle version and exported it in the latest Paddle environment, and the results are extremely poor. Can a version mismatch between the two environments cause this? Or would the problem go away if I trained OCRv5 with the latest Paddle version?
-
For example, I am training text recognition for PCBs with about 1,300 English images. Is it necessary to add the v5 pretrained model? I noticed that the dictionary of the English v5 rec model mixes Chinese and English.
On 2025-05-29 20:29:11, "liuhongen1234567" wrote:
Try to keep the Paddle version the same for training and inference. The jump from 2.4 to 3.0 is fairly large, and some operators may have changed. Training OCRv5 with the latest Paddle version generally does not cause problems.
-
My problem is very similar to his. If you could look at his question, I would know how to solve mine.
On 2025-05-29 20:26:09, "liuhongen1234567" wrote:
Hello, at the moment the model name has to be chosen from the text detection model list. Could you try setting the model name to PP-OCRv4_mobile_det?
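For anyone hitting the same missing-name error, here is a minimal sketch of adding the entry to the exported config by hand, as suggested in the replies. It assumes the exported file keeps the model name as model_name under a top-level Global section. The helper ensure_model_name is hypothetical and does plain text editing rather than real YAML parsing. Per the reply above, the name has to be one from the text detection model list (PP-OCRv4_mobile_det was the suggestion), so a training-config name such as ch_PP-OCRv3_det_cml may still be rejected.

```python
def ensure_model_name(config_text, model_name):
    """Return config_text with a Global.model_name entry (hypothetical helper).

    Assumption: the config uses a top-level "Global:" section with
    two-space indentation. Existing model_name entries are left untouched.
    """
    lines = config_text.splitlines()
    for i, line in enumerate(lines):
        if line.rstrip() == "Global:":
            # Walk the indented block under Global: looking for model_name.
            j = i + 1
            while j < len(lines) and (lines[j].startswith(" ") or not lines[j].strip()):
                if lines[j].strip().startswith("model_name:"):
                    return config_text  # already present
                j += 1
            lines.insert(i + 1, f"  model_name: {model_name}")
            return "\n".join(lines) + "\n"
    # No Global section at all: prepend one.
    return f"Global:\n  model_name: {model_name}\n" + config_text


# Example with an in-memory config; for a real file, read the exported
# config, pass its text through ensure_model_name, and write it back.
exported = "PreProcess:\n  transform_ops: []\n"
print(ensure_model_name(exported, "PP-OCRv4_mobile_det"))
```

Keeping a backup of the exported file before editing it is a sensible precaution.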
-
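However, after adding the pretrained model (I used one for the v3 detection model), testing shows that it detects extra text. Most detections match the format of my annotated dataset, but in about 10% of cases there are extra detections, or a single letter is detected two or three times, which leads to false results from the recognition model.
On 2025-05-29 20:49:45, "liuhongen1234567" wrote:
In many document scenarios Chinese and English are mixed, so v5 unifies things to some extent and tries to handle Chinese, English, Traditional Chinese, Japanese, and Pinyin with a single model. v5 does not change the model much; the main improvement is in the training data. For English, I would recommend the v3 or v4 English model instead. 1,300 images are far from enough for training from scratch. Recognition models in academia are usually pretrained on 800,000 or even 1,000,000 images.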
-
Does PaddleOCR offer any data augmentation, for example offline augmentation like in object detection, to expand the dataset?
On 2025-05-29 20:59:10, "liuhongen1234567" wrote:
OK, your scenario looks quite domain-specific, so it is worth running a few more experiments. In your case, training without the pretrained model may work better.
-
Are there any demos or materials on offline augmentation for OCR?
-
Hello, one more question. I trained a PP-OCR det model with the input size changed from the 960 in the original configuration file to 224. Now that training is finished, do I need to specify input_size when using the official Python code, or should text_det_limit_side_len be set to 224?
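As a rough illustration of what a side-length limit does at inference time, the sketch below mimics the resize step commonly used by DB-style detectors: scale the image so that its longer (or shorter) side respects the limit, then round both sides to a multiple of 32. The function name det_resize_shape, the stride of 32, and the rounding details are assumptions for illustration, not a copy of the PaddleOCR code, so check the preprocessing in your installed version.

```python
def det_resize_shape(height, width, limit_side_len=960, limit_type="max", stride=32):
    """Estimate the (height, width) a detector input is resized to.

    limit_type="max": shrink only if the longer side exceeds the limit.
    limit_type="min": enlarge only if the shorter side is below the limit.
    """
    if limit_type == "max":
        longest = max(height, width)
        ratio = limit_side_len / longest if longest > limit_side_len else 1.0
    elif limit_type == "min":
        shortest = min(height, width)
        ratio = limit_side_len / shortest if shortest < limit_side_len else 1.0
    else:
        raise ValueError(f"unsupported limit_type: {limit_type!r}")

    def snap(side):
        # Scale, then round to the nearest multiple of `stride` (at least one stride).
        return max(int(round(int(side * ratio) / stride)) * stride, stride)

    return snap(height), snap(width)


# A 1080x1920 image under the default limit and under a 224 limit.
print(det_resize_shape(1080, 1920, limit_side_len=960))
print(det_resize_shape(1080, 1920, limit_side_len=224))
```

Under these assumptions, a model trained on 224-pixel crops sees very different text scales at inference if the limit stays at 960, which is one reason to try matching the limit to the training size and compare results.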
-
Hello, when fine-tuning with a small dataset (a little over 1,000 training images), how can I freeze the earlier layers and train only the last few? Where can this be configured?
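No config switch for this is mentioned in the thread, so the following is only a sketch of the usual Paddle approach: mark the parameters to freeze with stop_gradient = True before the optimizer is built. The helper freeze_by_prefix and the "backbone." prefix are assumptions; print the names from model.named_parameters() to see what your model actually uses.

```python
def freeze_by_prefix(named_params, prefixes):
    """Set stop_gradient=True on parameters whose name starts with a prefix.

    named_params: iterable of (name, parameter) pairs, e.g. the result of
    model.named_parameters() on a paddle.nn.Layer.
    Returns the list of frozen parameter names.
    """
    prefixes = tuple(prefixes)
    frozen = []
    for name, param in named_params:
        if name.startswith(prefixes):
            param.stop_gradient = True  # Paddle skips gradients for this tensor
            frozen.append(name)
    return frozen


# Intended use inside a training script (prefix is an assumption):
# frozen = freeze_by_prefix(model.named_parameters(), ["backbone."])
# print(f"froze {len(frozen)} parameters")
```

Freezing has to happen after the pretrained weights are loaded and before training starts. Whether it helps on a dataset of about 1,000 images is something to verify experimentally.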
-
Hello, recognition fails on images that contain multiple lines of text. How should I solve this?
-
Hello, the PP-OCRv5 example code indicates that TensorRT is supported, but when I run inference with TRT, some TRT 8 operations appear to be missing: AttributeError: 'tensorrt.tensorrt.IBuilderConfig' object has no attribute 'set_memory_pool_limit'
Environment:
CUDA: 11.2
CUDNN: 8.4.1.50
TensorRT: 8.0.1.6
The official TRT samples work and can be tested normally.
Below is the TRT error. Some operator-related output was printed in between, and I am not sure whether the operators are also a problem. Below that is my OCR example code, which shows the models used.
On 2025-06-03 17:27:57, "liuhongen1234567" wrote:
Hello, DBPostProcess in /usr/local/lib/python3.10/dist-packages/paddlex/inference/models/text_detection/processors.py has a step that filters detection boxes. I am not sure whether that is what you need.
On 2025-06-04 17:00:20, "liuhongen1234567" wrote:
Hello, for multi-line text I recommend using the PP-OCRv5 pipeline directly (https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/pipeline_usage/OCR.html), which runs detection first and then recognition.
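The AttributeError is consistent with a TensorRT version mismatch: as far as I know, IBuilderConfig.set_memory_pool_limit only appeared in TensorRT 8.4, while the environment above has 8.0.1.6. A small check along these lines can fail early with a clearer message. The 8.4 threshold is an assumption to verify against the TensorRT release notes, and supports_memory_pool_limit is a hypothetical helper.

```python
def supports_memory_pool_limit(trt_version):
    """Return True if the version string is at least 8.4 (assumed threshold)."""
    # Pad with "0" so a bare major version such as "10" still parses.
    parts = (trt_version.split(".") + ["0"])[:2]
    major, minor = (int(p) for p in parts)
    return (major, minor) >= (8, 4)


for version in ("8.0.1.6", "8.4.1.5"):
    print(version, supports_memory_pool_limit(version))
```

In a real script the string would come from tensorrt.__version__. If the check fails, the options are upgrading TensorRT to a version that matches the installed Paddle build or running without the TensorRT backend.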
-
Hello, I ran into two problems: The error is as follows: If I use the command line, it runs correctly:
-
After upgrading Paddle to 3.1, can I still run inference with a rec model that was fine-tuned on version 2.6? It seems to cause problems.
-
With the mobile v5 version, after training on my own data, spaces go missing in the output. The dataset does contain samples with spaces, in both Chinese and English. Inference is run with paddleocr. What could be the cause?
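Two things are worth checking first, both assumptions about a typical PaddleOCR rec setup: that use_space_char is enabled under Global in the training config, and that enough training labels actually contain spaces. The second can be measured with a small script, assuming the usual label format of one image path, a tab, and the text per line. The helper space_stats is hypothetical.

```python
def space_stats(label_lines):
    """Count labels that contain an inner space.

    Each line is expected to look like "image_path<TAB>text".
    Returns (labels_with_space, total_labels); malformed lines are skipped.
    """
    total = 0
    with_space = 0
    for line in label_lines:
        _, sep, text = line.rstrip("\n").partition("\t")
        if not sep:
            continue  # no tab separator, not a valid label line
        total += 1
        if " " in text.strip():
            with_space += 1
    return with_space, total


sample = ["img/0001.jpg\tPART 12A", "img/0002.jpg\tR105", "broken line without tab"]
print(space_stats(sample))
```

For a real dataset, pass the opened label file instead of the sample list. A very low share of labels with spaces would make it plausible that the model rarely learns to emit them.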
-
Hello, this looks like a Paddle version issue. On my side, training with Paddle 3.0 on the demo dataset works normally. Text recognition involves some special characters, such as the start, end, and padding tokens, so it is normal for the number of model classes to be slightly larger than the number of characters in the dictionary.
鸿飞万里
Original message (sent Thursday, July 17, 2025, 1:44 PM):
wc -l ./ppocr/utils/dict/ppocrv5_dict.txt
18383 ./ppocr/utils/dict/ppocrv5_dict.txt
ppocrv5_dict.txt has 18383 characters, while the pretrained model downloaded with wget https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_server_rec_pretrained.pdparams expects 18389, and training fails with the error below.
[2025/07/17 05:43:33] ppocr INFO: train with paddle 2.5.2 and device Place(gpu:0)
[2025/07/17 05:43:33] ppocr INFO: Initialize indexes of datasets:['../00.SampleDataset/ocr_rec_dataset_examples/train.txt']
[2025/07/17 05:43:33] ppocr INFO: Initialize indexes of datasets:['../00.SampleDataset/ocr_rec_dataset_examples/val.txt']
W0717 05:43:33.289924 583 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.2, Runtime API Version: 12.0
W0717 05:43:33.290535 583 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
[2025/07/17 05:43:33] ppocr INFO: train dataloader has 51 iters
[2025/07/17 05:43:33] ppocr INFO: valid dataloader has 17 iters
[2025/07/17 05:43:33] ppocr INFO: load pretrain successful from ./pretrain/PP-OCRv5_server_rec_pretrained
[2025/07/17 05:43:33] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
Error: ../paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion id < N failed. Id should smaller than 18389 but received an id value: 4635217965677641761.
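The gap between 18383 and 18389 itself matches the explanation above: a few extra classes for special tokens. A separate sanity check that is cheap to run before training is whether every character in the labels is actually in the dictionary. This is a minimal sketch, assuming one character per dictionary line and tab-separated label lines. The helper check_labels is hypothetical, not part of PaddleOCR.

```python
def check_labels(label_lines, dict_chars, use_space_char=True):
    """Report label characters that are missing from the dictionary.

    Returns (vocabulary_size, unknown), where unknown maps each
    out-of-dictionary character to its number of occurrences.
    """
    vocab = set(dict_chars)
    if use_space_char:
        vocab.add(" ")  # assumption: the space is added on top of the dictionary
    unknown = {}
    for line in label_lines:
        _, sep, text = line.rstrip("\n").partition("\t")
        if not sep:
            continue  # skip lines without an "image_path<TAB>text" layout
        for ch in text:
            if ch not in vocab:
                unknown[ch] = unknown.get(ch, 0) + 1
    return len(vocab), unknown


labels = ["a.jpg\tab c", "b.jpg\tabz"]
print(check_labels(labels, ["a", "b", "c"]))
```

With real data, dict_chars would be the lines of ppocrv5_dict.txt with the trailing newline removed. An empty unknown mapping does not rule out the version problem described above; it only removes one possible cause.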
-
Why does the GPU performance curve show a sawtooth pattern during training?
-
Traceback (most recent call last):
-
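I am training a rec model. My dataset contains a large number of characters, so I modified the yml file as follows: After training, acc converged to about 0.93, but when I evaluate with tools/eval.py, acc is always 0. What do I need to adjust?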
-
How do I export an ONNX model?
-
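In the Chinese and English text I need to recognize, spaces are frequently missed. I re-annotated the text where spaces were missed, about 2,900 images in total. After training, the space problem was solved, but now some text is detected incorrectly or missed entirely. What causes this?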
-
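How can I make the PP-OCRv5 model recognize only horizontal text? All of my text is horizontal, but some lines sit very close together vertically, and the model recognizes two vertically adjacent texts as a single one. How can I adjust the model so that it does not recognize text in the vertical direction?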
-
I downloaded PP-OCRv5_server_rec and call it locally. How do I specify a local dictionary?
-
I get FileNotFoundError: [Errno 2] No such file or directory: './train_data/train_list.txt'. What is going on? The ocr_rec_dataset_examples.tar I downloaded does not contain this file either. The full log is as follows:
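This error usually means the label file path in the config does not match what is on disk. The training log earlier in this thread loads ocr_rec_dataset_examples/train.txt and val.txt, which suggests the example archive ships those files while the config still points at ./train_data/train_list.txt. A likely fix is to point data_dir and label_file_list in the Train and Eval dataset sections at the extracted files. The sketch below checks both the label file and the images it references; find_missing_files is a hypothetical helper and the paths in the usage comment are assumptions.

```python
import os


def find_missing_files(data_dir, label_file):
    """Return paths that do not exist on disk.

    If the label file itself is missing, it is the only entry returned.
    Otherwise each line is read as "relative_image_path<TAB>text" and the
    image path is resolved against data_dir.
    """
    if not os.path.isfile(label_file):
        return [label_file]
    missing = []
    with open(label_file, encoding="utf-8") as f:
        for line in f:
            rel_path = line.rstrip("\n").split("\t", 1)[0]
            if rel_path and not os.path.isfile(os.path.join(data_dir, rel_path)):
                missing.append(rel_path)
    return missing


# Example call (directory layout is an assumption):
# print(find_missing_files("./ocr_rec_dataset_examples",
#                          "./ocr_rec_dataset_examples/train.txt"))
```

An empty list means the label file and every referenced image were found. Relative paths are resolved from the directory the training script is started in.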
-
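With the current OCRv5 recognition model, why does recognition accuracy drop severely after fine-tuning? Text that was recognized correctly before is now recognized incorrectly. Do I need to freeze something? Optimizer: Architecture:
Loss:
PostProcess: Metric: Train:
This is my current .yaml file. I am fine-tuning to fix the model missing the spaces between digits and letters. After training, the spaces are recognized, but the model now makes many errors when detecting and recognizing text it previously handled.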
-
The static graph model can be integrated directly into the PaddleOCR API. How exactly is this step done?
-
PaddleOCR/main/version3.x/module_usage/text_recognition
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/main/version3.x/module_usage/text_recognition.html