
Commit 2cf629d

Update docs for a better experience. (#16419)
1 parent f6ef577 commit 2cf629d

37 files changed: +340 −94 lines

README.md

Lines changed: 26 additions & 4 deletions
@@ -9,7 +9,7 @@ English | [简体中文](./readme/README_cn.md) | [繁體中文](./readme/README
 [![arXiv](https://img.shields.io/badge/arXiv-2507.05595-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2507.05595)
 [![PyPI Downloads](https://static.pepy.tech/badge/paddleocr/month)](https://pepy.tech/project/paddleocr)
 [![PyPI Downloads](https://static.pepy.tech/badge/paddleocr)](https://pepy.tech/project/paddleocr)
-[![Used by](https://img.shields.io/badge/Used%20by-5.8k%2B%20repositories-blue)](https://github.com/PaddlePaddle/PaddleOCR/network/dependents)
+[![Used by](https://img.shields.io/badge/Used%20by-5.9k%2B%20repositories-blue)](https://github.com/PaddlePaddle/PaddleOCR/network/dependents)
 
 ![python](https://img.shields.io/badge/python-3.8~3.12-aff.svg)
 ![os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg)
@@ -368,7 +368,7 @@ print(chat_result)
 - [Huawei Ascend](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_NPU.html)
 - [KUNLUNXIN](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/other_devices_support/paddlepaddle_install_XPU.html)
 
-## More Features
+## 🧩 More Features
 
 - Convert models to ONNX format: [Obtaining ONNX Models](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/obtaining_onnx_models.html).
 - Accelerate inference using engines like OpenVINO, ONNX Runtime, TensorRT, or perform inference using ONNX format models: [High-Performance Inference](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/high_performance_inference.html).
@@ -394,16 +394,32 @@ print(chat_result)
 </p>
 </div>
 
+
+## ✨ Stay Tuned
+
+**Star this repository to keep up with exciting updates and new releases, including powerful OCR and document parsing capabilities!**
+
+<div align="center">
+<p>
+<img width="1200" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/star_paddleocr.en.gif" alt="Star-Project">
+</p>
+</div>
+
 ## 👩‍👩‍👧‍👦 Community
 
+<div align="center">
+
 | PaddlePaddle WeChat official account | Join the tech discussion group |
 | :---: | :---: |
 | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qrcode_for_paddlepaddle_official_account.jpg" width="150"> | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qr_code_for_the_questionnaire.jpg" width="150"> |
+</div>
 
 
 ## 😃 Awesome Projects Leveraging PaddleOCR
 PaddleOCR wouldn't be where it is today without its incredible community! 💗 A massive thank you to all our longtime partners, new collaborators, and everyone who's poured their passion into PaddleOCR — whether we've named you or not. Your support fuels our fire!
 
+<div align="center">
+
 | Project Name | Description |
 | ------------ | ----------- |
 | [RAGFlow](https://github.com/infiniflow/ragflow) <a href="https://github.com/infiniflow/ragflow"><img src="https://img.shields.io/github/stars/infiniflow/ragflow"></a>|RAG engine based on deep document understanding.|
@@ -414,17 +430,23 @@ PaddleOCR wouldn't be where it is today without its incredible community! 💗 A
 | [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) <a href="https://github.com/opendatalab/PDF-Extract-Kit"><img src="https://img.shields.io/github/stars/opendatalab/PDF-Extract-Kit"></a>|A powerful open-source toolkit designed to efficiently extract high-quality content from complex and diverse PDF documents.|
 | [Dango-Translator](https://github.com/PantsuDango/Dango-Translator)<a href="https://github.com/PantsuDango/Dango-Translator"><img src="https://img.shields.io/github/stars/PantsuDango/Dango-Translator"></a> |Recognize text on the screen, translate it and show the translation results in real time.|
 | [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)|
+</div>
 
 ## 👩‍👩‍👧‍👦 Contributors
 
+<div align="center">
 <a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors">
 <img src="https://contrib.rocks/image?repo=PaddlePaddle/PaddleOCR&max=400&columns=20" width="800"/>
 </a>
-
+</div>
 
 ## 🌟 Star
 
-[![Star History Chart](https://api.star-history.com/svg?repos=PaddlePaddle/PaddleOCR&type=Date)](https://star-history.com/#PaddlePaddle/PaddleOCR&Date)
+<div align="center">
+<p>
+<img width="800" src="https://api.star-history.com/svg?repos=PaddlePaddle/PaddleOCR&type=Date" alt="Star-history">
+</p>
+</div>
 
 
 ## 📄 License

docs/quick_start.en.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Since GPU installation requires specific CUDA versions, the following example is
 python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
 ```
 
-**Please note that PaddleOCR depends on PaddlePaddle version `3.0` or above.**
+**Please note that PaddleOCR 3.x depends on PaddlePaddle version `3.0` or above.**
 
 #### 2. Install `paddleocr`
 
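The requirement above (PaddleOCR 3.x needs PaddlePaddle `3.0` or above) can be sanity-checked after installation with a small sketch like the following. The `meets_min_version` helper is illustrative and not part of PaddleOCR; only `paddle.__version__` is a real attribute of the installed framework.

```python
def meets_min_version(version: str, floor: tuple = (3, 0)) -> bool:
    """Return True if a dotted version string satisfies the given floor."""
    parts = []
    for token in version.split("."):
        # Keep only digits so suffixes like "0rc0" still parse.
        digits = "".join(ch for ch in token if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    parts += [0] * len(floor)  # pad short strings like "3"
    return tuple(parts[: len(floor)]) >= floor

if __name__ == "__main__":
    try:
        import paddle  # requires paddlepaddle to be installed
        print(paddle.__version__, meets_min_version(paddle.__version__))
    except ImportError:
        print("paddlepaddle is not installed")
```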
docs/quick_start.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ For GPU installation, since the GPU build must be installed to match the specific CUDA version,
 python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
 ```
 
-**Please note that PaddleOCR depends on PaddlePaddle framework version `3.0` or above.**
+**Please note that PaddleOCR 3.x depends on PaddlePaddle framework version `3.0` or above.**
 
 #### 2. Install `paddleocr`
 

docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.en.md

Lines changed: 21 additions & 10 deletions
@@ -1507,22 +1507,33 @@ The serving inference test is based on the NVIDIA A100 + Intel Xeon Platinum 835
 
 # FAQ
 
-1. What is the default configuration? How to get higher accuracy, faster speed, or smaller GPU memory?
+**Q: What is the default model configuration? If I need higher accuracy, faster speed, or lower GPU memory usage, which parameters should I adjust or which models should I switch to? How significant is the impact on results?**
 
-When using mobile OCR models + PP-FormulaNet_plus-M, and max length of text detection set to 1200, if set use_chart_recognition to False and dont not load the chart recognition model, the GPU memory would be reduced.
+**A:** By default, the largest models for each module are used. Section 3.3 demonstrates how different model selections affect GPU memory consumption and inference speed. You can choose an appropriate model based on your device capabilities and the complexity of your samples. Additionally, in the Python API or CLI, you can set the device parameter as `<device_type>:<device_id1>,<device_id2>...` (e.g., `gpu:0,1,2,3`) to enable multi-GPU parallel inference. If the built-in multi-GPU parallel inference does not meet your speed requirements, you may refer to the example code for multi-process parallel inference and further optimize it for your specific scenario: [Multi-process Parallel Inference](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/instructions/parallel_inference.html).
 
-On the V100, the peak and average GPU memory would be reduced from 8776.0 MB and 8680.8 MB to 6118.0 MB and 6016.7 MB, respectively; On the A100, the peak and average GPU memory would be reduced from 11716.0 MB and 11453.9 MB to 9850.0 MB and 9593.5 MB, respectively.
+---
 
-You can using multi-gpus by setting `device` to `gpu:<no.>,<no.>`, such as `gpu:0,1,2,3`. And about multi-process parallel inference, you can refer: [Multi-Process Parallel Inference](https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/pipeline_usage/instructions/parallel_inference.en.md#example-of-multi-process-parallel-inference).
+**Q: Can PP-StructureV3 run on CPU?**
 
-2. About serving deployment
+**A:** While PP-StructureV3 is recommended to run on GPU for optimal performance, it also supports CPU inference. Thanks to a variety of configuration options and sufficient optimization for lightweight models, users can refer to section 3.3 to select lightweight configurations for CPU-only environments. For example, on an Intel 8350C CPU, the inference time per image is about 3.74 seconds.
 
-(1) Can the service handle requests concurrently?
+---
 
-For the basic serving deployment solution, the service processes only one request at a time. This plan is mainly used for rapid verification, to establish the development chain, or for scenarios where concurrent requests are not required.
+**Q: How can I integrate PP-StructureV3 into my own project?**
 
-For high-stability serving deployment solution, the service process only one request at a time by default, but you can refer to the related docs to adjust achieve scaling.
+**A:**
+- For Python projects, you can directly integrate using the PaddleOCR Python API.
+- For projects in other programming languages, it is recommended to use service-based deployment. PaddleOCR supports client-side integration in multiple languages, including C++, C#, Java, Go, and PHP. Please refer to the [official documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/PP-StructureV3.html#3-development-integration-deployment) for details.
+- If you need to interact with large language models, PaddleOCR also provides the MCP service. For more information, please refer to the [MCP Server documentation](https://www.paddleocr.ai/latest/en/version3.x/deployment/mcp_server.html).
 
-(2)How to reduce latency and improve throughput?
+---
 
-Use the High-performance inference plugin, and deploy multi instances.
+**Q: Can the service handle concurrent requests?**
+
+**A:** In the basic service deployment solution, the service processes only one request at a time, which is mainly intended for quick validation, development pipeline integration, or scenarios that do not require concurrent processing. In the high-stability serving solution, the service also processes one request at a time by default, but users can achieve horizontal scaling and concurrent processing by adjusting the configuration as outlined in the service deployment guide.
+
+---
+
+**Q: How can I reduce latency and increase throughput in serving?**
+
+**A:** PaddleOCR offers two types of service deployment solutions. Regardless of the solution used, enabling high-performance inference plugins can accelerate model inference and thus reduce latency. For the high-stability deployment solution, throughput can be further increased by adjusting the service configuration to run multiple instances, making full use of your hardware resources. Please refer to the [documentation](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_deploy/serving.html#22-adjust-configurations) for more details on configuring high-stability serving.
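The fan-out shape of the parallel inference discussed in the FAQ above can be sketched in miniature. This is a hedged illustration only: `predict_one` is a stub standing in for a real per-device PP-StructureV3 call, `DEVICES` is a hypothetical device list, and a thread pool is used here for brevity where the linked example uses separate processes (which matters for CPU-bound Python work).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical device list; a real deployment would create one pipeline
# instance per worker, each bound to one of these devices.
DEVICES = ["gpu:0", "gpu:1"]

def predict_one(device, image_path):
    """Stub standing in for a per-device PP-StructureV3 predict() call."""
    # A real worker would run something like: pipeline.predict(image_path)
    return {"device": device, "input": image_path}

def parallel_predict(image_paths):
    """Round-robin images across devices and collect results in input order."""
    with ThreadPoolExecutor(max_workers=len(DEVICES)) as pool:
        futures = [
            pool.submit(predict_one, DEVICES[i % len(DEVICES)], path)
            for i, path in enumerate(image_paths)
        ]
        return [f.result() for f in futures]
```

Collecting `futures` in submission order keeps results aligned with the input list even though workers finish at different times.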

docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.md

Lines changed: 28 additions & 13 deletions
@@ -1,11 +1,13 @@
 # I. Introduction to PP-StructureV3
-The **PP-StructureV3** pipeline builds on the general layout parsing v1 pipeline, strengthening layout region detection, table recognition, and formula recognition, and adding chart understanding, multi-column reading-order recovery, and conversion of results to Markdown files. It performs excellently on a wide variety of document data and can handle fairly complex documents. The pipeline also provides flexible serving deployment, supporting invocation from multiple programming languages on many kinds of hardware. It likewise supports secondary development: you can train and fine-tune on your own dataset, and the trained model can be seamlessly integrated.
+PP-StructureV3 efficiently converts document images and PDF files into structured content (such as Markdown), with strong capabilities in layout region detection, table recognition, formula recognition, chart understanding, and multi-column reading-order recovery. It performs excellently across many document types and can handle complex document data. PP-StructureV3 supports flexible serving deployment, is compatible with a variety of hardware environments, and can be invoked from multiple programming languages. It also supports secondary development: users can train and optimize models on their own datasets, and the trained models can be integrated seamlessly.
 
 <div align="center">
-<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/PP-StructureV3/algorithm_ppstructurev3.png" width="600"/>
+<img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/PP-StructureV3/algorithm_ppstructurev3.png" width="800"/>
 </div>
 
 # II. Key Metrics
+
+<div align="center">
 <table>
 <thead>
 <tr>
@@ -296,6 +298,7 @@
 </tr>
 </tbody>
 </table>
+</div>
 
 Some of the above data comes from:
 * <a href="https://github.com/opendatalab/OmniDocBench">OmniDocBench</a>
@@ -1492,27 +1495,39 @@
 
 <div align="center">
 <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/PP-StructureV3/algorithm_ppstructurev3_demo.png" width="600"/>
-</div>
+</div>
 
 <a href="https://paddle-model-ecology.bj.bcebos.com/paddlex%2FPaddleX3.0%2Fdoc_images%2FPP-StructureV3%2Falgorithm_ppstructurev3_demo.pdf">More examples</a>
 
 # V. Usage and FAQ
 
-1. What is the default model configuration? If higher accuracy, faster speed, or lower GPU memory is needed, which parameters should be adjusted or which models replaced, and how large is the impact on results?
+**Q: What is the default model configuration? If I need higher accuracy, faster speed, or lower GPU memory usage, which parameters should I adjust or which models should I switch to, and how large is the impact on results?**
+
+**A:** By default, the model with the largest parameter count is used for each module. Section 3.3 shows how different model choices affect GPU memory usage and inference speed, so you can pick a suitable model based on your device and the difficulty of your samples. In addition, setting the device parameter in the Python API or CLI to <device_type>:<device_id1>,<device_id2>... (e.g., gpu:0,1,2,3) enables multi-GPU parallel inference. If the built-in multi-GPU parallel inference is still not fast enough, you can refer to the multi-process parallel inference example code and optimize further for your scenario: [Multi-process Parallel Inference](https://www.paddleocr.ai/latest/version3.x/pipeline_usage/instructions/parallel_inference.html)
+
+---
+
+**Q: Can PP-StructureV3 run on CPU?**
+
+**A:** Although GPU inference is recommended for PP-StructureV3, it also supports running on CPU. Thanks to the many configuration options and the thorough optimization of lightweight models, users in CPU-only environments can refer to Section 3.3 and choose a lightweight configuration. For example, on an Intel 8350C CPU the inference time is about 3.74 seconds per image.
+
+---
 
-On top of the "lightweight OCR models + lightweight formula model, text detection max 1200" setup, setting use_chart_recognition to False in the pipeline configuration file so that the chart recognition model is not loaded further reduces GPU memory usage. In the V100 test environment, peak and average GPU memory usage drop from 8776.0 MB and 8680.8 MB to 6118.0 MB and 6016.7 MB respectively; in the A100 test environment, peak and average GPU memory usage drop from 11716.0 MB and 11453.9 MB to 9850.0 MB and 9593.5 MB respectively.
-Setting the device parameter in the Python API or CLI to <device_type>:<device_id1>,<device_id2>... (e.g., gpu:0,1,2,3) enables multi-GPU parallel inference. If the built-in multi-GPU parallel inference is still not fast enough, refer to the multi-process parallel inference example code and optimize further for your scenario: [Multi-process Parallel Inference](https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/pipeline_usage/instructions/parallel_inference.md#%E5%A4%9A%E8%BF%9B%E7%A8%8B%E5%B9%B6%E8%A1%8C%E6%8E%A8%E7%90%86%E7%A4%BA%E4%BE%8B)
+**Q: How can I integrate PP-StructureV3 into my own project?**
 
-2. Common questions about serving deployment
+**A:**
+- For Python projects, you can integrate directly via the PaddleOCR Python API.
+- For other programming languages, serving-based integration is recommended. PaddleOCR supports client-side calls from multiple languages, including C++, C#, Java, Go, and PHP; see the [official documentation](https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PP-StructureV3.html#3)
+- If you need to interact with large language models, PaddleOCR also provides an MCP service; see [MCP Server](https://www.paddleocr.ai/latest/version3.x/deployment/mcp_server.html)
 
-(1) Can the service handle requests concurrently?
+---
 
-With the basic serving solution, the service handles only one request at a time; it is mainly used for quick validation, establishing the development pipeline, or scenarios that do not require concurrent requests.
+**Q: Can the serving deployment handle requests concurrently?**
 
-With the high-stability serving solution, the service handles only one request at a time by default, but users can follow the serving deployment guide and adjust the configuration to scale horizontally so that the service handles multiple requests at once.
+**A:** With the basic serving solution, the service handles only one request at a time; it is mainly used for quick validation, establishing the development pipeline, or scenarios that do not require concurrent requests. With the high-stability serving solution, the service also handles one request at a time by default, but users can follow the serving deployment guide and adjust the configuration to scale horizontally so the service can handle multiple requests concurrently.
 
-(2) How to reduce latency and increase throughput?
+---
 
-Whichever serving solution is used, enabling the high-performance inference plugin speeds up model inference and thus reduces latency.
+**Q: How can serving deployment reduce latency and increase throughput?**
 
-In addition, with the high-stability serving solution, adjusting the service configuration to run multiple instances makes full use of the deployment machine's resources and effectively increases throughput.
+**A:** PaddleOCR offers two serving deployment solutions. With either one, enabling the high-performance inference plugin speeds up model inference and thus reduces latency. In addition, for the high-stability solution, adjusting the service configuration to run multiple instances makes full use of the deployment machine's resources and effectively increases throughput. For adjusting the high-stability configuration, see the [documentation](https://paddlepaddle.github.io/PaddleX/latest/pipeline_deploy/serving.html#22)
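The `<device_type>:<device_id1>,<device_id2>...` convention quoted in the FAQ above is easy to validate before launching a pipeline. A minimal sketch follows; the `parse_device` helper is illustrative and not a PaddleOCR API.

```python
def parse_device(device: str):
    """Split a '<device_type>:<id1>,<id2>...' string into (type, id list).

    Mirrors the convention quoted in the FAQ above, e.g. "gpu:0,1,2,3";
    a bare type such as "cpu" yields an empty id list.
    """
    dev_type, _, id_part = device.partition(":")
    ids = [int(part) for part in id_part.split(",") if part]
    return dev_type, ids
```

For example, `parse_device("gpu:0,1,2,3")` yields the device type plus four ids, one per card to be used for parallel inference.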

docs/version3.x/deployment/high_performance_inference.en.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+---
+comments: true
+---
+
 # High-Performance Inference
 
 In real-world production environments, many applications have stringent performance requirements for deployment strategies, particularly regarding response speed, to ensure efficient system operation and a smooth user experience. PaddleOCR provides high-performance inference capabilities, allowing users to enhance model inference speed with a single click without worrying about complex configurations or underlying details. Specifically, PaddleOCR's high-performance inference functionality can:

docs/version3.x/deployment/high_performance_inference.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+---
+comments: true
+---
+
 # High-Performance Inference
 
 In real-world production environments, many applications impose strict performance requirements on deployment strategies, especially response speed, to ensure efficient system operation and a smooth user experience. PaddleOCR provides high-performance inference capabilities that let users speed up model inference with one click, without worrying about complex configuration or low-level details. Specifically, PaddleOCR's high-performance inference can:

docs/version3.x/deployment/mcp_server.en.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+---
+comments: true
+---
+
 # PaddleOCR MCP Server
 
 [![PaddleOCR](https://img.shields.io/badge/OCR-PaddleOCR-orange)](https://github.com/PaddlePaddle/PaddleOCR)

docs/version3.x/deployment/mcp_server.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+---
+comments: true
+---
+
 # PaddleOCR MCP Server
 
 [![PaddleOCR](https://img.shields.io/badge/OCR-PaddleOCR-orange)](https://github.com/PaddlePaddle/PaddleOCR)

docs/version3.x/deployment/obtaining_onnx_models.en.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,7 @@
1+
---
2+
comments: true
3+
---
4+
15
# Obtaining ONNX Models
26

37
PaddleOCR provides a rich collection of pre-trained models, all stored in PaddlePaddle's static graph format. To use these models in ONNX format during deployment, you can convert them using the Paddle2ONNX plugin provided by PaddleX. For more information about PaddleX and its relationship with PaddleOCR, refer to [Differences and Connections Between PaddleOCR and PaddleX](../paddleocr_and_paddlex.en.md#1-Differences-and-Connections-Between-PaddleOCR-and-PaddleX).
