latest/version3.x/deployment/high_performance_inference #16425
-
When deploying a PaddleOCR service on an A800 GPU with high-performance inference enabled, I ran the install command `paddleocr install_hpi_deps gpu`, and the HPI (high-performance inference) dependencies installed successfully. But on the first run of this script:

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=True,
    enable_hpi=True)  # text detection + text recognition

result = ocr.predict("images/2025-07-31_03_33_13.jpg")
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
```

the TensorRT engine is built automatically, and the build fails with this error:

```
/root/miniconda3/envs/paddle/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
WARNING: OMP_NUM_THREADS set to 14, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
Creating model: ('PP-LCNet_x1_0_textline_ori', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-LCNet_x1_0_textline_ori`.
The Paddle Inference backend is selected with the default configuration. This may not provide optimal performance.
Using Paddle Inference backend
Paddle predictor option: device_type: gpu, device_id: 0, run_mode: paddle, trt_dynamic_shapes: {'x': [[1, 3, 80, 160], [1, 3, 80, 160], [8, 3, 80, 160]]}, cpu_threads: 10, delete_pass: [], enable_new_ir: True, enable_cinn: False, trt_cfg_setting: {}, trt_use_dynamic_shapes: True, trt_collect_shape_range_info: True, trt_discard_cached_shape_range_info: False, trt_dynamic_shape_input_data: None, trt_shape_range_info_path: None, trt_allow_rebuild_at_runtime: True, mkldnn_cache_capacity: 10
Creating model: ('PP-OCRv5_server_det', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-OCRv5_server_det`.
Inference backend: tensorrt
Inference backend config: precision='fp32' use_dynamic_shapes=True dynamic_shapes={'x': [[1, 3, 32, 32], [1, 3, 736, 736], [1, 3, 4000, 4000]]}
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(572)::BuildTrtEngine Start to building TensorRT Engine...
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(239)::log 1: No Myelin Error exists
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(239)::log 1: [runnerBuilderBase.cpp::buildAndSerializeMyelinGraph::326] Error Code 1: Myelin (No Myelin Error exists)
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(636)::BuildTrtEngine Failed to call buildSerializedNetwork().
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(752)::CreateTrtEngineFromOnnx Failed to build tensorrt engine.
[INFO] ultra_infer/runtime/runtime.cc(320)::CreateTrtBackend Runtime initialized with Backend::TRT in Device::GPU.
Creating model: ('PP-OCRv5_server_rec', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-OCRv5_server_rec`.
TensorRT dynamic shapes will be loaded from the file.
Inference backend: tensorrt
Inference backend config: precision='fp16' use_dynamic_shapes=True dynamic_shapes={'x': [[1, 3, 48, 160], [1, 3, 48, 320], [8, 3, 48, 3200]]}
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(719)::CreateTrtEngineFromOnnx Detect serialized TensorRT Engine file in /root/.paddlex/official_models/PP-OCRv5_server_rec/.cache/tensorrt/trt_serialized.trt, will load it directly.
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update [New Shape Out of Range] input name: x, shape: [8, 3, 48, 3200], The shape range before: min_shape=[-1, 3, 48, -1], max_shape=[-1, 3, 48, -1].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update [New Shape Out of Range] The updated shape range now: min_shape=[8, 3, 48, 3200], max_shape=[8, 3, 48, 3200].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update [New Shape Out of Range] input name: x, shape: [1, 3, 48, 160], The shape range before: min_shape=[8, 3, 48, 3200], max_shape=[8, 3, 48, 3200].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update [New Shape Out of Range] The updated shape range now: min_shape=[1, 3, 48, 160], max_shape=[8, 3, 48, 3200].
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(108)::LoadTrtCache Build TensorRT Engine from cache file: /root/.paddlex/official_models/PP-OCRv5_server_rec/.cache/tensorrt/trt_serialized.trt with shape range information as below,
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(111)::LoadTrtCache Input name: x, shape=[-1, 3, 48, -1], min=[1, 3, 48, 160], max=[8, 3, 48, 3200]
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(239)::log 3: [runtime.cpp::~Runtime::346] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::346, condition: mEngineCounter.use_count() == 1. Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior.
)
[INFO] ultra_infer/runtime/runtime.cc(320)::CreateTrtBackend Runtime initialized with Backend::TRT in Device::GPU.
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(445)::SetInputs TRTBackend SetInputs not find name:x
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
No stack trace in paddle, may be caused by external reasons.
----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
  [TimeInfo: *** Aborted at 1757387445 (unix time) try "date -d @1757387445" if you are using GNU date ***]
  [SignalInfo: *** SIGABRT (@0x2556) received by PID 9558 (TID 0x7f96c25bd740) from PID 9558 ***]
Aborted (core dumped)
```

What typically causes this, and is there a known fix?
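The repeated `[New Shape Out of Range]` warnings and the failure while loading `trt_serialized.trt` suggest a stale serialized engine whose recorded dynamic-shape range no longer matches the configured one. One low-risk first step is to delete each model's TensorRT cache so the engine and shape-range info are rebuilt from scratch. A minimal sketch (cache paths taken from the log above; `clear_trt_cache` is a hypothetical helper, not part of PaddleOCR):

```python
import shutil
from pathlib import Path

# Model directories named in the log above; adjust to your installation.
MODEL_DIRS = [
    Path.home() / ".paddlex/official_models/PP-OCRv5_server_det",
    Path.home() / ".paddlex/official_models/PP-OCRv5_server_rec",
]

def clear_trt_cache(model_dir: Path) -> bool:
    """Remove the serialized TensorRT engine and shape-range cache, if present."""
    cache = model_dir / ".cache" / "tensorrt"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False

for d in MODEL_DIRS:
    print(d.name, "cache cleared" if clear_trt_cache(d) else "no cache found")
```

After clearing, the next run rebuilds the engine; if the Myelin build error persists even with a clean cache, it is more likely a TensorRT/driver version mismatch than a cache problem.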
-
Why is inference with high-performance mode enabled actually slower than without it? Have I misconfigured something?
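A frequent cause is measuring the very first call: with `enable_hpi=True`, the first `predict()` can include one-off costs such as building the TensorRT engine and collecting dynamic shapes, which dwarf per-image inference time. A minimal timing sketch to compare warmed-up latency (`time_predict` is a hypothetical helper; the commented usage assumes the `PaddleOCR` pipeline from the first comment):

```python
import time

def time_predict(predict_fn, image_path, warmup=2, runs=10):
    """Average latency of predict_fn over `runs` calls, after `warmup` calls.

    Warmup absorbs one-off costs (engine build, shape collection) so the
    steady-state speed of the two configurations can be compared fairly.
    """
    for _ in range(warmup):
        predict_fn(image_path)
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(image_path)
    return (time.perf_counter() - start) / runs

# Usage sketch:
# ocr_hpi = PaddleOCR(enable_hpi=True)
# ocr_std = PaddleOCR(enable_hpi=False)
# print("HPI:     ", time_predict(ocr_hpi.predict, "images/sample.jpg"))
# print("default: ", time_predict(ocr_std.predict, "images/sample.jpg"))
```

If HPI is still slower after warmup, the log from the first comment hints at another possibility: when the TensorRT build fails, some models silently fall back to the default Paddle Inference backend, so you may be paying HPI overhead without its speedup.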
- Related documentation: https://www.paddleocr.ai/latest/version3.x/deployment/high_performance_inference.html