PaddleOCR/main/en/version3.x/module_usage/text_recognition #15712

2025-06-13T01:28:06Z

giscus[bot]
bot Jun 13, 2025

PaddleOCR/main/en/version3.x/module_usage/text_recognition

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

https://paddlepaddle.github.io/PaddleOCR/main/en/version3.x/module_usage/text_recognition.html

DragonPow · 2025-06-13T01:28:07Z

DragonPow
Jun 13, 2025 — with giscus

I want to train a model for a new language. Is there a step-by-step tutorial for this?

2 replies

liuhongen1234567 Jun 13, 2025
Collaborator

Hello, you can refer to the OCR data preparation documentation to annotate data in the new language, and then refer to the text recognition documentation for training, evaluation, and export.

liuhongen1234567 Jun 13, 2025
Collaborator

Alternatively, if you already have cropped recognition images and recognition labels and do not plan to train the text detection module, you can refer to this document to prepare the recognition data.
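Before training on a new language, it can help to confirm that every label uses only characters from your dictionary. The sketch below is illustrative and not part of PaddleOCR: `check_label_file` is a hypothetical helper, and it assumes the tab-separated `image_path<TAB>text` label layout and a dictionary with one character per line.

```python
def check_label_file(label_lines, dictionary_chars):
    """Validate `image_path<TAB>text` lines against a character dictionary.

    Returns a list of (line_number, message) tuples; an empty list means
    no problems were found. Hypothetical helper, not a PaddleOCR API.
    """
    allowed = set(dictionary_chars)
    problems = []
    for number, raw in enumerate(label_lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        path, sep, text = line.partition("\t")
        if not sep or not path or not text:
            problems.append((number, "expected image_path<TAB>text"))
            continue
        # Characters the recognizer could never output with this dictionary.
        missing = sorted(set(text) - allowed)
        if missing:
            problems.append((number, "not in dictionary: " + "".join(missing)))
    return problems


if __name__ == "__main__":
    # Assumption: include " " in the dictionary characters only if your
    # training config enables the space character.
    labels = ["a.jpg\tab c\n", "b.jpg abc\n", "c.jpg\tabd\n"]
    for number, message in check_label_file(labels, list("abc ")):
        print(f"line {number}: {message}")
```

Running it over the real label file before training surfaces malformed lines and out-of-dictionary characters early, which is cheaper than finding them through poor accuracy later.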

pluszerominus · 2025-06-27T13:03:44Z

pluszerominus
Jun 27, 2025 — with giscus

Good afternoon. I have finished training the PP-OCRv4_server_rec model and exported it to the inference format. When I try to use the model through the Python API (the PaddleOCR class), I get the error "AssertionError: Model name mismatch, please input the correct model dir.." Why can such an error occur? The PP-OCRv4_server_rec model is specified in the yml configuration files.

2 replies

liuhongen1234567 Jun 27, 2025
Collaborator

Hello, this is because the default model for PaddleOCR 3.0 is PP-OCRv5_server_rec. For other models, in addition to modifying text_recognition_model_dir, you also need to modify text_recognition_model_name. The correct code is as follows:

from paddleocr import PaddleOCR

ocr = PaddleOCR(
    text_recognition_model_name="PP-OCRv4_server_rec",
    text_recognition_model_dir="export_model_path",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)
result = ocr.predict("./general_ocr_002.png")
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")

pluszerominus Jul 1, 2025

Thanks a lot, it worked

tieupham-ltp · 2025-07-04T17:04:27Z

tieupham-ltp
Jul 4, 2025 — with giscus

Hello, I have trained with my dataset and my custom dictionary file, specifically vi_dict in PP-OCRv5_server_rec, but I couldn't find a way to pass the path to my dictionary file during prediction, like the rec_char_dict_path parameter in OCRv4. How can I use my dictionary file for prediction?
Thank you.

1 reply

liuhongen1234567 Jul 5, 2025
Collaborator

Hello, this is a new feature of PaddleOCR 3.0. The dictionary used during training is now stored in the inference.yaml file of the exported model. Therefore, during inference, it will be read from inference.yaml, and there’s no need to set it separately anymore.
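To confirm which dictionary an exported model actually carries, you can inspect the exported config directly. The following is a minimal sketch under stated assumptions, not PaddleOCR API: the helper names are hypothetical, the character list is assumed to sit under `PostProcess.character_dict`, and the file may be named `inference.yml` or `inference.yaml` (both spellings appear in this thread). Loading the file requires PyYAML.

```python
from pathlib import Path


def load_inference_config(model_dir):
    """Parse the exported config; needs PyYAML (pip install pyyaml)."""
    import yaml  # imported lazily so the helpers below work without it

    model_dir = Path(model_dir)
    # Assumption: the file name varies, so try both spellings.
    for name in ("inference.yml", "inference.yaml"):
        path = model_dir / name
        if path.exists():
            with open(path, encoding="utf-8") as f:
                return yaml.safe_load(f)
    raise FileNotFoundError(f"no inference.yml or inference.yaml in {model_dir}")


def embedded_characters(config):
    """Return the character list stored in the config, or [] if absent.

    Assumes the list lives under PostProcess.character_dict.
    """
    post = config.get("PostProcess") or {}
    return list(post.get("character_dict") or [])


def compare_with_dict_file(config, dict_lines):
    """Report how the embedded dictionary differs from an external one."""
    embedded = embedded_characters(config)
    external = [line.rstrip("\n") for line in dict_lines]
    return {
        "same_order": embedded == external,
        "only_in_model": sorted(set(embedded) - set(external)),
        "only_in_file": sorted(set(external) - set(embedded)),
    }


if __name__ == "__main__":
    # Stand-in config so the example runs without an exported model.
    config = {"PostProcess": {"character_dict": ["a", "b", "c"]}}
    print(compare_with_dict_file(config, ["a\n", "b\n", "d\n"]))
```

With a real export, pass `load_inference_config("your_model_dir")` and the lines of your training dictionary file. A mismatch in content or order would suggest the model was exported with a different dictionary than the one you expect.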

zhangyubo0722 · 2025-07-08T08:40:51Z

zhangyubo0722
Jul 8, 2025
Collaborator

PaddleOCR 3.1 has been released, featuring the new PP-OCRv5 multilingual text recognition model. It supports 37 languages including French, Spanish, Portuguese, Russian, Korean, and more, achieving an average recognition accuracy improvement of over 30%. Welcome to try it out!

0 replies

liuhongen1234567 · 2025-07-16T08:03:33Z

liuhongen1234567
Jul 16, 2025
Collaborator

Hello, the training data for PaddleOCR is internal and will not be made publicly available. If you need to use it, you can upload your data to the OCR no-code pipeline in AI Studio, which has a fusion factor parameter that allows training with the original data mixed in.

Original message from Albert Nathanael (July 16, 2025, 11:11):

Hi PaddleOCR team, I want to fine-tune the PP-OCRv5 mobile recognition model with my own scanned document dataset. Is the original training data list (e.g., train_data.txt and images/labels) for PP-OCRv5 mobile-rec publicly available? If not, can you recommend public datasets similar to the ones you used for the Latin (English/Indonesian) model? Thank you!

1 reply

Diversi0n Jul 18, 2025 — with giscus

Hello, thanks for the clarification!

I’m new to PaddleOCR and very interested in fine-tuning the PP-OCRv5 mobile model using my own document dataset. Could you kindly share a tutorial on how to train using AI Studio no-code pipeline with the fusion factor? I’d like to make sure the model’s original knowledge is preserved and strengthened by my data.

Appreciate your help!

Le-Bao-181003 · 2025-07-16T08:24:09Z

Le-Bao-181003
Jul 16, 2025 — with giscus

I’m currently fine-tuning the PP-OCRv5_mobile_rec model on a custom Japanese dataset that includes both handwritten and printed text. I followed the official PaddleOCR pipeline for training and successfully exported the inference model, which generated the following files in my output folder:
inference/
├── inference.pdiparams
├── inference.pdmodel
├── inference.yml

I then attempted to load this fine-tuned model using the following code:

import cv2
from paddleocr import PaddleOCR
import time

recognition_model = 'PP-OCRv5_mobile_rec_finetune_infer'

ocr = PaddleOCR(
    text_detection_model_name="PP-OCRv5_mobile_det",
    text_recognition_model_name="PP-OCRv5_server_rec",
    text_recognition_model_dir=recognition_model,
)

img_path = "images/crop/Screenshot_2025-07-03_crop_6.jpg"
result = ocr.predict(img_path)

However, I encountered the following error:

ValueError: (InvalidArgument) Type of attribute: strides is not right.
[Hint: Expected attributes.at("strides").dyn_cast<pir::ArrayAttribute>().at(i).isa<pir::Int32Attribute>() == true, but received attributes.at("strides").dyn_cast<pir::ArrayAttribute>().at(i).isa<pir::Int32Attribute>():0 != true:1.] (at paddle\fluid\pir\dialect\operator\ir\pd_op3.cc:24692)

Interestingly, if I replace the model folder with the official inference_model (e.g., the original downloaded version), it works fine.

Could you please advise what might be causing this issue? Is there any incompatibility between the export format and the runtime, or any specific config I need to adjust after fine-tuning?
Thank you so much for your help!

1 reply

liuhongen1234567 Jul 17, 2025
Collaborator

Hello, could you provide more details about the environment, for example, the versions of PaddlePaddle, PaddleOCR, and PaddleX?

budian92 · 2025-07-18T03:42:48Z

budian92
Jul 18, 2025 — with giscus

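In the text recognition example code, output = model.predict(input="general_ocr_rec_001.png", batch_size=1) has no parameter for text boxes. I need to pass in the text boxes produced by the det module, but the parameter list has no such argument, so I don't know how to pass them.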

4 replies

liuhongen1234567 Jul 18, 2025
Collaborator

Hello, the output returned here is a generator, so it only runs when you iterate over it with a for loop. Here, dt_polys holds the text boxes.

Example code:

from paddleocr import TextDetection
model = TextDetection(model_name="PP-OCRv5_server_det")
output = model.predict("general_ocr_001.png", batch_size=1)
for res in output:
    bbox = res["dt_polys"]

budian92 Jul 21, 2025 — with giscus

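I may not have explained this clearly. I first run the TextDetection module to detect text regions and get the corresponding dt_polys. After manually checking the dt_polys and filtering out some anomalies, I want to pass them to TextRecognition to recognize the text. The documentation does not explain how to pass dt_polys to TextRecognition. I tried model = TextDetection(model_name="PP-OCRv5_server_det")
output = model.predict("general_ocr_001.png", batch_size=1, dt_polys=xxxx), which raises an error saying that predict has no dt_polys parameter. I don't know how to pass it. I need to run text detection and text recognition as two separate steps.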

budian92 Jul 21, 2025 — with giscus

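There was a typo at the end of my description above. Where I wrote model = TextDetection(model_name="PP-OCRv5_server_det")
output = model.predict("general_ocr_001.png", batch_size=1, dt_polys=xxxx), it should have been model = TextRecognition(model_name="PP-OCRv5_server_det"). During text recognition, I don't know how to pass in the dt_polys I already have so that the text inside those boxes is recognized.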

liuhongen1234567 Jul 21, 2025
Collaborator

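Hello, the text recognition model currently does not support passing coordinates directly. You can crop the recognition regions out of the original image according to dt_polys, then pass the cropped images in through the input parameter. The OCR pipeline is currently implemented this way as well. For the specific logic, see https://github.com/PaddlePaddle/PaddleX/blob/a8ec0bdec40d4108859ebda48e079d2fdcfb5b82/paddlex/inference/pipelines/ocr/pipeline.py#L280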
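The crop-then-recognize approach can be sketched roughly as follows. This is a simplified illustration, not the pipeline's own code: `crop_by_polys` and `detect_filter_recognize` are hypothetical names, and the crop uses each polygon's axis-aligned bounding box, so rotated text lines are not rectified. It also assumes that `TextRecognition.predict` accepts a list of numpy arrays and that each result exposes `rec_text`; check both against your installed version.

```python
import numpy as np


def crop_by_polys(image, polys):
    """Crop the axis-aligned bounding box of each polygon from an H x W x C image."""
    height, width = image.shape[:2]
    crops = []
    for poly in polys:
        pts = np.asarray(poly, dtype=np.float32).reshape(-1, 2)
        x0, y0 = np.floor(pts.min(axis=0)).astype(int)
        x1, y1 = np.ceil(pts.max(axis=0)).astype(int)
        # Clamp to the image so boxes that spill over the edge still crop.
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, width), min(y1, height)
        if x1 > x0 and y1 > y0:  # skip boxes with no area inside the image
            crops.append(image[y0:y1, x0:x1].copy())
    return crops


def detect_filter_recognize(image_path, keep=lambda poly: True):
    """Two-stage OCR: detect, filter boxes with `keep`, then recognize the crops."""
    import cv2
    from paddleocr import TextDetection, TextRecognition

    image = cv2.imread(image_path)
    det = TextDetection(model_name="PP-OCRv5_server_det")
    polys = []
    for res in det.predict(image_path, batch_size=1):
        polys.extend(p for p in res["dt_polys"] if keep(p))

    crops = crop_by_polys(image, polys)
    if not crops:
        return []
    rec = TextRecognition(model_name="PP-OCRv5_server_rec")
    # Assumption: a list of arrays is a valid input and "rec_text" is a result key.
    return [res["rec_text"] for res in rec.predict(input=crops, batch_size=1)]
```

The `keep` callback is where a manual check would go, for example rejecting boxes below a minimum height. For skewed or rotated text, a perspective crop of the four polygon corners would likely give better results than a bounding box.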

NikitinaMaria · 2025-08-07T11:17:46Z

NikitinaMaria
Aug 7, 2025 — with giscus

Hello. I have trained PP-OCRv5_mobile_rec with my dataset and my custom dictionary file. Then I exported the model using:

python tools/export_model.py -c configs/rec/rec_new.yml \
-o Global.pretrained_model=output/rec_new_v1/latest \
Global.save_inference_dir=output/rec_new_inference \
Global.inference_model=True

But the recognition results differ between predict_rec.py and the Python API:

python3 tools/infer/predict_rec.py --image_dir={img_path} --rec_model_dir=output/rec_new_inference --rec_char_dict_path=new_dict.txt

And

from paddleocr import PaddleOCR

ocr = PaddleOCR(
    text_recognition_model_name="PP-OCRv5_mobile_rec",
    text_recognition_model_dir="rec_new_inference",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)
result = ocr.predict(img)

Can you tell me what could have caused this problem?

0 replies
