- Convert models to ONNX format: [Obtaining ONNX Models](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/obtaining_onnx_models.html).
- Accelerate inference using engines like OpenVINO, ONNX Runtime, TensorRT, or perform inference using ONNX format models: [High-Performance Inference](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/high_performance_inference.html).
</p>
</div>
## ✨ Stay Tuned
⭐ **Star this repository to keep up with exciting updates and new releases, including powerful OCR and document parsing capabilities!** ⭐
PaddleOCR wouldn't be where it is today without its incredible community! 💗 A massive thank you to all our longtime partners, new collaborators, and everyone who's poured their passion into PaddleOCR — whether we've named you or not. Your support fuels our fire!
<div align="center">
| Project Name | Description |
| ------------ | ----------- |
|[RAGFlow](https://github.com/infiniflow/ragflow) <a href="https://github.com/infiniflow/ragflow"><img src="https://img.shields.io/github/stars/infiniflow/ragflow"></a>|RAG engine based on deep document understanding.|
|[PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) <a href="https://github.com/opendatalab/PDF-Extract-Kit"><img src="https://img.shields.io/github/stars/opendatalab/PDF-Extract-Kit"></a>|A powerful open-source toolkit designed to efficiently extract high-quality content from complex and diverse PDF documents.|
|[Dango-Translator](https://github.com/PantsuDango/Dango-Translator) <a href="https://github.com/PantsuDango/Dango-Translator"><img src="https://img.shields.io/github/stars/PantsuDango/Dango-Translator"></a>|Recognize text on the screen, translate it and show the translation results in real time.|
|[Learn more projects](./awesome_projects.md)|[More projects based on PaddleOCR](./awesome_projects.md)|
[](https://star-history.com/#PaddlePaddle/PaddleOCR&Date)
**docs/version3.x/algorithm/PP-StructureV3/PP-StructureV3.en.md**
# FAQ
**Q: What is the default model configuration? If I need higher accuracy, faster speed, or lower GPU memory usage, which parameters should I adjust or which models should I switch to? How significant is the impact on results?**
**A:** By default, the largest models for each module are used. Section 3.3 demonstrates how different model selections affect GPU memory consumption and inference speed, so you can choose an appropriate model based on your device capabilities and the complexity of your samples. For example, with mobile OCR models + PP-FormulaNet_plus-M and a maximum text detection side length of 1200, setting `use_chart_recognition` to `False` (so the chart recognition model is not loaded) reduces peak/average GPU memory from 8776.0 MB/8680.8 MB to 6118.0 MB/6016.7 MB on a V100, and from 11716.0 MB/11453.9 MB to 9850.0 MB/9593.5 MB on an A100. Additionally, in the Python API or CLI, you can set the `device` parameter as `<device_type>:<device_id1>,<device_id2>...` (e.g., `gpu:0,1,2,3`) to enable multi-GPU parallel inference. If the built-in multi-GPU parallel inference does not meet your speed requirements, you may refer to the example code for multi-process parallel inference and further optimize it for your specific scenario: [Multi-process Parallel Inference](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/instructions/parallel_inference.html).
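As a minimal sketch of the multi-GPU setting (assuming the `PPStructureV3` class and `device` keyword of the PaddleOCR 3.x Python API, with a placeholder input file):

```python
from paddleocr import PPStructureV3

# A comma-separated device string asks the pipeline to parallelize across GPUs;
# adjust the IDs to match the GPUs available on your machine.
pipeline = PPStructureV3(device="gpu:0,1,2,3")

# predict() accepts an image or PDF path; each result corresponds to one page.
for res in pipeline.predict("sample.pdf"):
    res.save_to_markdown(save_path="output")
```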
---
**Q: Can PP-StructureV3 run on CPU?**
**A:** While running PP-StructureV3 on a GPU is recommended for optimal performance, it also supports CPU inference. Thanks to its wide range of configuration options and well-optimized lightweight models, you can follow section 3.3 to select a lightweight configuration for CPU-only environments. For example, on an Intel Xeon Platinum 8350C CPU, the inference time per image is about 3.74 seconds.
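For illustration only, a CPU-oriented setup might look like the sketch below; the module-level `*_model_name` keyword names are assumptions about the pipeline's configuration options, so verify them against section 3.3 and the API reference:

```python
from paddleocr import PPStructureV3

# Hypothetical lightweight configuration for a CPU-only machine:
# use mobile OCR models and skip chart recognition to cut memory use and latency.
pipeline = PPStructureV3(
    device="cpu",
    text_detection_model_name="PP-OCRv5_mobile_det",    # assumed parameter name
    text_recognition_model_name="PP-OCRv5_mobile_rec",  # assumed parameter name
    use_chart_recognition=False,
)

for res in pipeline.predict("page.png"):
    res.save_to_json(save_path="output")
```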
---
**Q: How can I integrate PP-StructureV3 into my own project?**
**A:**
- For Python projects, you can directly integrate using the PaddleOCR Python API.
- For projects in other programming languages, it is recommended to use service-based deployment. PaddleOCR supports client-side integration in multiple languages, including C++, C#, Java, Go, and PHP. Please refer to the [official documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/PP-StructureV3.html#3-development-integration-deployment) for details.
- If you need to interact with large language models, PaddleOCR also provides the MCP service. For more information, please refer to the [MCP Server documentation](https://www.paddleocr.ai/latest/en/version3.x/deployment/mcp_server.html).
---
**Q: Can the service handle concurrent requests?**
**A:** In the basic service deployment solution, the service processes only one request at a time, which is mainly intended for quick validation, development pipeline integration, or scenarios that do not require concurrent processing. In the high-stability serving solution, the service also processes one request at a time by default, but users can achieve horizontal scaling and concurrent processing by adjusting the configuration as outlined in the service deployment guide.
---
**Q: How can I reduce latency and increase throughput in serving?**
**A:** PaddleOCR offers two types of service deployment solutions. Regardless of the solution used, enabling high-performance inference plugins can accelerate model inference and thus reduce latency. For the high-stability deployment solution, throughput can be further increased by adjusting the service configuration to run multiple instances, making full use of your hardware resources. Please refer to the [documentation](https://paddlepaddle.github.io/PaddleX/latest/en/pipeline_deploy/serving.html#22-adjust-configurations) for more details on configuring high-stability serving.
**docs/version3.x/deployment/high_performance_inference.en.md**

---
comments: true
---
# High-Performance Inference
In real-world production environments, many applications have stringent performance requirements for deployment strategies, particularly regarding response speed, to ensure efficient system operation and a smooth user experience. PaddleOCR provides high-performance inference capabilities, allowing users to enhance model inference speed with a single click without worrying about complex configurations or underlying details. Specifically, PaddleOCR's high-performance inference functionality can:
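As a rough sketch of that one-click idea (the `enable_hpi` flag name is an assumption and should be checked against the current API), enabling it from Python could look like:

```python
from paddleocr import PaddleOCR

# Assumed one-flag switch: enable_hpi routes model execution through the
# high-performance inference backends where they are available.
ocr = PaddleOCR(enable_hpi=True)

for res in ocr.predict("invoice.png"):
    res.print()
```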
**docs/version3.x/deployment/obtaining_onnx_models.en.md**

---
comments: true
---
# Obtaining ONNX Models
PaddleOCR provides a rich collection of pre-trained models, all stored in PaddlePaddle's static graph format. To use these models in ONNX format during deployment, you can convert them using the Paddle2ONNX plugin provided by PaddleX. For more information about PaddleX and its relationship with PaddleOCR, refer to [Differences and Connections Between PaddleOCR and PaddleX](../paddleocr_and_paddlex.en.md#1-Differences-and-Connections-Between-PaddleOCR-and-PaddleX).
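As a sketch of how a conversion could be scripted (the plugin's CLI flags `--paddle2onnx`, `--paddle_model_dir`, and `--onnx_model_dir` are assumptions to be checked against the installed PaddleX version, and the model directory names are placeholders):

```python
import subprocess

# Convert an exported static-graph model directory to ONNX by shelling out
# to the PaddleX CLI with the Paddle2ONNX plugin installed.
subprocess.run(
    [
        "paddlex", "--paddle2onnx",
        "--paddle_model_dir", "PP-OCRv5_server_det_infer",  # input: Paddle static-graph model
        "--onnx_model_dir", "PP-OCRv5_server_det_onnx",     # output: directory for the ONNX model
    ],
    check=True,
)
```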