fix(inference): Synchronize cpu_threads configuration for HPI predictors #4972

SankaraVenkatRam wants to merge 5 commits into PaddlePaddle:develop
Conversation
Thanks for your contribution!

CI_XPU is failing during repository clone with a TLS handshake error on the XPU runner.

The PR is still blocked due to the same CI failure on the XPU runner. This appears to be a TLS/network issue on the XPU CI runner rather than something introduced by this PR.

The CI failure occurs during the git fetch of the PR branch. The logs show connection timeouts and proxy instability when accessing GitHub, which points to a CI runner network/proxy issue. Kindly requesting that the maintainers check the runner connectivity.
```python
def sync_threads(self):
    if self._pp_option and self._pp_option.cpu_threads:
        # If the user specified threads in the old system,
```
pp_option and hpi_config are primarily designed for two orthogonal features. The former is intended for configuring the Paddle Inference Engine (pp referring to PaddlePaddle), while the latter is used for High-Performance Inference (HPI) settings. Although pp_option can take effect when the HPI backend is set to Paddle Inference, I am concerned that updating the CPU thread configuration via pp_option in a way that impacts other backends (e.g., ONNX Runtime) may not be appropriate. Such behavior could potentially lead to confusion.
paddlex/inference/utils/pp_option.py
```diff
         "device_type": device_type,
         "device_id": device_id,
-        "cpu_threads": 10,
+        "cpu_threads": int(os.getenv("CPU_NUM_THREADS", 10)),
```
I would suggest adding the prefix PADDLEX_PDX to the environment variable.
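A hedged sketch of the suggested change. The final variable name (here `PADDLEX_PDX_CPU_NUM_THREADS`) is an assumption based on the reviewer's suggested prefix, and the helper function is purely illustrative:

```python
import os


def default_cpu_threads(fallback: int = 10) -> int:
    """Read the CPU thread default from an environment variable.

    The variable name follows the reviewer's suggested PADDLEX_PDX prefix;
    the actual name merged into PaddleX may differ. Falls back to `fallback`
    when the variable is unset or not a valid integer.
    """
    raw = os.getenv("PADDLEX_PDX_CPU_NUM_THREADS")
    if raw is None:
        return fallback
    try:
        return int(raw)
    except ValueError:
        return fallback
```

Validating the value (rather than calling `int()` directly on the raw string, as in the diff above) avoids crashing at import time on a malformed environment variable.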
Force-pushed from 88ecc8f to 8c4afeb
@Bobholamovic Thanks for the review! I've pushed a follow-up commit addressing both points.

It looks like the CI_XPU job failed during the checkout step due to a TLS connection error when cloning the repository.

Sorry, there is something wrong with CI. Could you please create a new PR to trigger CI?
Summary

This PR fixes an issue where user-defined CPU thread settings (e.g. `cpu_threads=2`) were ignored when using the High-Performance Inference (HPI) backend. Although the value was correctly set in the legacy `PaddlePredictorOption`, it was never propagated to HPI, which instead used its own default (usually 10 threads). This could lead to unexpected CPU overuse, especially in containers or limited environments.

What changed

- Synchronized `PaddlePredictorOption` and `HPIConfig` via a new `sync_threads` step in `BasePredictor`
- Added the `CPU_NUM_THREADS` environment variable in `pp_option.py`

Result

- User-defined `cpu_threads` values are now respected by the HPI backend
- `CPU_NUM_THREADS` provides a simple global override when no value is set in code

Why it matters

This aligns user intent with actual runtime behavior and avoids silent over-allocation of CPU resources.
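The override order implied by the summary (an explicit `cpu_threads` set in code always wins; the environment variable is only a fallback) can be sketched as a small resolution helper. This is an illustrative function, not PaddleX code:

```python
import os
from typing import Optional


def resolve_cpu_threads(explicit: Optional[int], default: int = 10) -> int:
    # Hypothetical helper illustrating the precedence described above:
    # 1. a value set explicitly in code (e.g. cpu_threads=2) always wins;
    # 2. otherwise the CPU_NUM_THREADS environment variable applies;
    # 3. otherwise fall back to the built-in default.
    if explicit is not None:
        return explicit
    env = os.getenv("CPU_NUM_THREADS")
    if env is not None and env.isdigit():
        return int(env)
    return default
```

Keeping the environment variable lowest in precedence preserves backward compatibility: code that already passes `cpu_threads` behaves exactly as before.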