PaddleX/latest/pipeline_deploy/high_performance_inference #2698
Replies: 13 comments 25 replies
-
Is there high-performance support for NPUs, e.g. the Ascend 910B? In our current tests, response time drops sharply under high concurrency, which appears to be caused by multiple processes competing for NPU resources.
-
Why does the pipeline report the following error when a yaml file is specified? Traceback (most recent call last):
File "D:\TestProject\pythonProject3\PaddleX\test.py", line 3, in <module>
pipeline = create_pipeline(
^^^^^^^^^^^^^^^^
File "D:\TestProject\pythonProject3\PaddleX\paddlex\inference\pipelines\__init__.py", line 119, in create_pipeline
return create_pipeline_from_config(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\TestProject\pythonProject3\PaddleX\paddlex\inference\pipelines\__init__.py", line 70, in create_pipeline_from_config
pipeline_name = config["Global"]["pipeline_name"]
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'pipeline_name'
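The traceback shows that `create_pipeline_from_config` reads `config["Global"]["pipeline_name"]`, and the error `KeyError: 'pipeline_name'` means the parsed YAML has a `Global` section but no `pipeline_name` key inside it. A minimal sketch of the structure the loader expects, using plain dicts (the pipeline name `"OCR"` is only an illustrative placeholder):

```python
def read_pipeline_name(config: dict) -> str:
    # Mirrors the failing line in paddlex/inference/pipelines/__init__.py:
    # raises KeyError if the section or the key is missing.
    return config["Global"]["pipeline_name"]

broken = {"Global": {}}                        # Global exists, key missing
fixed = {"Global": {"pipeline_name": "OCR"}}   # structure the loader expects

try:
    read_pipeline_name(broken)
except KeyError as exc:
    print(f"KeyError: {exc}")  # -> KeyError: 'pipeline_name', as reported

print(read_pipeline_name(fixed))  # -> OCR
```

So the fix is to add the missing key under `Global` in the YAML file (or start from a config exported by `paddlex --get_pipeline_config`, which already contains it).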
-
This has been made far too complicated. What were the product managers thinking? The original toolkits are no longer maintained, and this all-in-one framework ended up neither fish nor fowl. I just want to integrate a DLL into a Windows project, and now I'm expected to set up Docker and a client-server communication model. Impressive. I deploy to several hundred machines; am I supposed to install Docker for the customer on each one first?
-
Can the high-performance inference plugin be installed and used on Windows 11?
-
Hello, will there be follow-up on a FastDeploy-style solution? I work in the industrial field, and at deployment time I really want things to work out of the box; a deployment flow like this is very inconvenient. Our hardware is also very limited: we usually deploy on Intel integrated graphics and can't even justify an RTX 4060.
-
Where is the list of models supported by high-performance inference?
-
PaddleX currently provides official prebuilt packages only for CUDA 11.8 + cuDNN 8.9, with CUDA 12 support in progress. When is it expected to be released?
-
When using high-performance inference, the ERROR below appears, although inference results are still produced normally. Environment: PaddleX 3.1 + PaddlePaddle 3.0.0.
-
Hi, I just want to deploy PaddleOCR into our host-machine software for recognizing text on parts. The product cycle is very short, and we need local OCR inference working in very little time. The plan is to use PP-OCRv5, convert it to an ONNX model, then to a TensorRT engine, and use TensorRT to accelerate OCR inference, without Docker or a service: just wrap a Python library around the TensorRT OCR model. Can this be done, calling the TensorRT OCR model locally from Python? And can PP-OCRv5 be converted to TensorRT without depending on your high-performance inference framework?
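The conversion chain the comment describes (Paddle → ONNX → TensorRT) is typically driven with the Paddle2ONNX CLI and TensorRT's `trtexec`. A rough sketch follows; all paths, file names, input names, and shape values are illustrative assumptions, and the OCR pre/post-processing still has to be reimplemented around the engine:

```shell
# 1. Export the Paddle inference model to ONNX with Paddle2ONNX.
#    (Model directory and file names are assumptions; adjust to your export.)
paddle2onnx \
    --model_dir ./PP-OCRv5_rec_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file rec.onnx \
    --opset_version 13

# 2. Build a TensorRT engine with trtexec. OCR recognition inputs have
#    variable width, so dynamic shape ranges are needed (values illustrative).
trtexec \
    --onnx=rec.onnx \
    --saveEngine=rec.engine \
    --minShapes=x:1x3x48x32 \
    --optShapes=x:1x3x48x320 \
    --maxShapes=x:8x3x48x1280 \
    --fp16
```

The resulting engine can then be loaded from Python with the `tensorrt` package, with no Docker or serving layer involved.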
-
Can high-performance inference be combined with high-stability serving deployment? If so, which container image should be used?
-
I'm puzzled: even the simplest general image classification pipeline fails to load the high-performance plugin. I'm starting to wonder whether my configuration is wrong or whether the plugin only supports a handful of specific models. Every pipeline I have tested reports: The Paddle Inference backend is selected with the default configuration. This may not provide optimal performance.
-
Since this pipeline feature arrived with 3.0, I no longer use PaddlePaddle.
(Quoting the previous comment:) The loading log for the general image classification pipeline is as follows:
λ 335394c1debc /home paddlex --serve --pipeline image_classification --device gpu:0 --host 0.0.0.0 --port 8010 --use_hpip
Creating model: ('PP-LCNet_x0_5', None)
Using official model (PP-LCNet_x0_5), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-LCNet_x0_5_infer.tar ...
Downloading PP-LCNet_x0_5_infer.tar ...
[==================================================] 100.00%
Extracting PP-LCNet_x0_5_infer.tar
[==================================================] 100.00%
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
The Paddle Inference backend is selected with the default configuration. This may not provide optimal performance.
Using Paddle Inference backend
Paddle predictor option: device_type: gpu, device_id: 0, run_mode: paddle, trt_dynamic_shapes: {'x': [[1, 3, 224, 224], [1, 3, 224, 224], [8, 3, 224, 224]]}, cpu_threads: 10, delete_pass: [], enable_new_ir: True, enable_cinn: False, trt_cfg_setting: {}, trt_use_dynamic_shapes: True, trt_collect_shape_range_info: True, trt_discard_cached_shape_range_info: False, trt_dynamic_shape_input_data: None, trt_shape_range_info_path: None, trt_allow_rebuild_at_runtime: True, mkldnn_cache_capacity: 10
INFO: Started server process [4305]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8010 (Press CTRL+C to quit)
-
1. High-performance inference doesn't feel any faster. I'm running the OCR pipeline on a CPU-only machine (8 cores, 32 GB RAM) with the Paddle2ONNX plugin installed. Any suggestions? Recognizing one image with about 100 English words currently takes over 10 seconds, and each OCR initialization takes 3 to 5 seconds.
2. After serving deployment, is it possible to switch the OCR recognition model? I'd like to initialize either the English or the Chinese model based on the type sent from the front end, which currently seems impossible.
3. I integrate via a Python script, but I have to reinitialize OCR for every recognition. Why is that? If I initialize once outside the function and then pass images in, the first recognition succeeds, the second fails, and the third succeeds again. So for now I reinitialize before every incoming image.
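On point 3, the usual pattern is to build the pipeline once and reuse it for every image; the alternating success/failure described above suggests state corruption inside the predictor, which is worth reporting as a bug with a full log. The initialize-once pattern, extended to a per-language cache for point 2, can be sketched as below. `create_ocr_pipeline` is a hypothetical stub standing in for the real 3-5 s PaddleOCR/PaddleX pipeline construction:

```python
from functools import lru_cache

def create_ocr_pipeline(lang: str):
    # Stand-in for the expensive real initialization; returns a callable
    # "predictor" so the caching pattern around it can be demonstrated.
    return lambda image: f"[{lang}] recognized {image}"

@lru_cache(maxsize=None)
def get_pipeline(lang: str):
    # The first call per language pays the initialization cost;
    # every later call for the same language reuses the cached object.
    return create_ocr_pipeline(lang)

# The front end sends a type; we pick the cached English or Chinese pipeline.
print(get_pipeline("en")("part_001.png"))
```

With this shape, switching between an English and a Chinese model is just a different `lang` key, and no request after the first pays the initialization cost again.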
-
PaddleX/latest/pipeline_deploy/high_performance_inference
https://paddlepaddle.github.io/PaddleX/latest/pipeline_deploy/high_performance_inference.html