latest/version3.x/deployment/high_performance_inference #16425
-
When deploying a PaddleOCR service on an A800 GPU with high-performance inference enabled, I ran the install command `paddleocr install_hpi_deps gpu`, and the HPI (high-performance inference) dependencies installed successfully. But on the first run of this script:

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=True,
    enable_hpi=True)  # text detection + text recognition

result = ocr.predict("images/2025-07-31_03_33_13.jpg")
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
```

the TensorRT engine is built automatically, and the build fails with this error:

```
/root/miniconda3/envs/paddle/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
WARNING: OMP_NUM_THREADS set to 14, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
Creating model: ('PP-LCNet_x1_0_textline_ori', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-LCNet_x1_0_textline_ori`.
The Paddle Inference backend is selected with the default configuration. This may not provide optimal performance.
Using Paddle Inference backend
Paddle predictor option: device_type: gpu, device_id: 0, run_mode: paddle, trt_dynamic_shapes: {'x': [[1, 3, 80, 160], [1, 3, 80, 160], [8, 3, 80, 160]]}, cpu_threads: 10, delete_pass: [], enable_new_ir: True, enable_cinn: False, trt_cfg_setting: {}, trt_use_dynamic_shapes: True, trt_collect_shape_range_info: True, trt_discard_cached_shape_range_info: False, trt_dynamic_shape_input_data: None, trt_shape_range_info_path: None, trt_allow_rebuild_at_runtime: True, mkldnn_cache_capacity: 10
Creating model: ('PP-OCRv5_server_det', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-OCRv5_server_det`.
Inference backend: tensorrt
Inference backend config: precision='fp32' use_dynamic_shapes=True dynamic_shapes={'x': [[1, 3, 32, 32], [1, 3, 736, 736], [1, 3, 4000, 4000]]}
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(572)::BuildTrtEngine Start to building TensorRT Engine...
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(239)::log 1: No Myelin Error exists
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(239)::log 1: [runnerBuilderBase.cpp::buildAndSerializeMyelinGraph::326] Error Code 1: Myelin (No Myelin Error exists)
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(636)::BuildTrtEngine Failed to call buildSerializedNetwork().
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(752)::CreateTrtEngineFromOnnx Failed to build tensorrt engine.
[INFO] ultra_infer/runtime/runtime.cc(320)::CreateTrtBackend Runtime initialized with Backend::TRT in Device::GPU.
Creating model: ('PP-OCRv5_server_rec', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-OCRv5_server_rec`.
TensorRT dynamic shapes will be loaded from the file.
Inference backend: tensorrt
Inference backend config: precision='fp16' use_dynamic_shapes=True dynamic_shapes={'x': [[1, 3, 48, 160], [1, 3, 48, 320], [8, 3, 48, 3200]]}
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(719)::CreateTrtEngineFromOnnx Detect serialized TensorRT Engine file in /root/.paddlex/official_models/PP-OCRv5_server_rec/.cache/tensorrt/trt_serialized.trt, will load it directly.
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update [New Shape Out of Range] input name: x, shape: [8, 3, 48, 3200], The shape range before: min_shape=[-1, 3, 48, -1], max_shape=[-1, 3, 48, -1].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update [New Shape Out of Range] The updated shape range now: min_shape=[8, 3, 48, 3200], max_shape=[8, 3, 48, 3200].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update [New Shape Out of Range] input name: x, shape: [1, 3, 48, 160], The shape range before: min_shape=[8, 3, 48, 3200], max_shape=[8, 3, 48, 3200].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update [New Shape Out of Range] The updated shape range now: min_shape=[1, 3, 48, 160], max_shape=[8, 3, 48, 3200].
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(108)::LoadTrtCache Build TensorRT Engine from cache file: /root/.paddlex/official_models/PP-OCRv5_server_rec/.cache/tensorrt/trt_serialized.trt with shape range information as below,
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(111)::LoadTrtCache Input name: x, shape=[-1, 3, 48, -1], min=[1, 3, 48, 160], max=[8, 3, 48, 3200]
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(239)::log 3: [runtime.cpp::~Runtime::346] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::346, condition: mEngineCounter.use_count() == 1. Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior.
)
[INFO] ultra_infer/runtime/runtime.cc(320)::CreateTrtBackend Runtime initialized with Backend::TRT in Device::GPU.
[ERROR] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(445)::SetInputs TRTBackend SetInputs not find name:x
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
No stack trace in paddle, may be caused by external reasons.
----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
  [TimeInfo: *** Aborted at 1757387445 (unix time) try "date -d @1757387445" if you are using GNU date ***]
  [SignalInfo: *** SIGABRT (@0x2556) received by PID 9558 (TID 0x7f96c25bd740) from PID 9558 ***]
Aborted (core dumped)
```

What typically causes this, and is there a known fix?
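The repeated `[New Shape Out of Range]` warnings and the failure while loading `trt_serialized.trt` suggest a stale serialized engine whose recorded dynamic-shape range no longer matches the configured one. One low-risk first step is to delete each model's TensorRT cache so the engine and shape-range info are rebuilt from scratch. A minimal sketch (cache paths taken from the log above; `clear_trt_cache` is a hypothetical helper, not part of PaddleOCR):

```python
import shutil
from pathlib import Path

# Model directories named in the log above; adjust to your installation.
MODEL_DIRS = [
    Path.home() / ".paddlex/official_models/PP-OCRv5_server_det",
    Path.home() / ".paddlex/official_models/PP-OCRv5_server_rec",
]

def clear_trt_cache(model_dir: Path) -> bool:
    """Remove the serialized TensorRT engine and shape-range cache, if present."""
    cache = model_dir / ".cache" / "tensorrt"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False

for d in MODEL_DIRS:
    print(d.name, "cache cleared" if clear_trt_cache(d) else "no cache found")
```

After clearing, the next run rebuilds the engine; if the Myelin build error persists even with a clean cache, it is more likely a TensorRT/driver version mismatch than a cache problem.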
-
Why is inference with high-performance mode enabled actually slower than without it? Have I misconfigured something?
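A frequent cause is measuring the very first call: with `enable_hpi=True`, the first `predict()` can include one-off costs such as building the TensorRT engine and collecting dynamic shapes, which dwarf per-image inference time. A minimal timing sketch to compare warmed-up latency (`time_predict` is a hypothetical helper; the commented usage assumes the `PaddleOCR` pipeline from the first comment):

```python
import time

def time_predict(predict_fn, image_path, warmup=2, runs=10):
    """Average latency of predict_fn over `runs` calls, after `warmup` calls.

    Warmup absorbs one-off costs (engine build, shape collection) so the
    steady-state speed of the two configurations can be compared fairly.
    """
    for _ in range(warmup):
        predict_fn(image_path)
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(image_path)
    return (time.perf_counter() - start) / runs

# Usage sketch:
# ocr_hpi = PaddleOCR(enable_hpi=True)
# ocr_std = PaddleOCR(enable_hpi=False)
# print("HPI:     ", time_predict(ocr_hpi.predict, "images/sample.jpg"))
# print("default: ", time_predict(ocr_std.predict, "images/sample.jpg"))
```

If HPI is still slower after warmup, the log from the first comment hints at another possibility: when the TensorRT build fails, some models silently fall back to the default Paddle Inference backend, so you may be paying HPI overhead without its speedup.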
- Related documentation: https://www.paddleocr.ai/latest/version3.x/deployment/high_performance_inference.html