Open
Labels
performance (issues related to performance regressions)
Description
Describe the issue
Title
Performance regression in ELU operator between v1.20.0 and v1.21.0 (≈220% slowdown, suspected Eigen update)
Description
We observed a significant performance regression in the ELU operator between onnxruntime v1.20.0 and v1.21.0.
Operator / Test Case Details
Operator
- Type: Elu
- Opset Version: 21
Input
- Name: X
- Shape: [2, 64, 28, 28] (4D tensor)
- Data type: float32
- Value range: min 0.100, max 9.999, mean 5.040
Output
- Name: output
- Shape: [2, 64, 28, 28] (same as input)
- Data type: float32
Attributes
- alpha: 1.0 (default value per ONNX specification)
- No additional attributes specified
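For reference, the computation under test can be sketched in plain Python. This is only an illustration of the ELU definition from the ONNX specification, not the onnxruntime kernel; the function name `elu` and the sample values are chosen here for the example. Since the reported input range is entirely positive, every element should take the identity branch of the definition.

```python
import math


def elu(x: float, alpha: float = 1.0) -> float:
    """Reference ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0.0 else alpha * math.expm1(x)


# Sample values taken from the reported input statistics (min, mean, max).
samples = [0.100, 5.040, 9.999]
outputs = [elu(v) for v in samples]

# All reported inputs are positive, so ELU acts as the identity on them.
print(outputs == samples)
```

A vectorized kernel may still evaluate the exponential for every element before selecting a branch, so a slower exp path could show up even with all-positive inputs. That is an assumption about the implementation, not something confirmed here.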
Model Information
- IR Version: 10
- Opset Version: 21
Regression Magnitude
- Approximately 220% slowdown in v1.21.0 compared to v1.20.0
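To make the figure unambiguous, here is a small sketch of two common ways a "220% slowdown" can be computed. The helper names and the latency values are hypothetical and are not measurements from this report.

```python
def slowdown_ratio(baseline_ms: float, regressed_ms: float) -> float:
    """How many times longer the regressed version takes."""
    return regressed_ms / baseline_ms


def percent_increase(baseline_ms: float, regressed_ms: float) -> float:
    """Extra latency relative to the baseline, in percent."""
    return (regressed_ms - baseline_ms) / baseline_ms * 100.0


# Hypothetical latencies, for illustration only.
baseline_ms = 1.0

# Reading 1: latency grew BY about 220% (about 3.2x the baseline).
print(percent_increase(baseline_ms, 3.2), slowdown_ratio(baseline_ms, 3.2))

# Reading 2: latency is about 220% OF the baseline (about 2.2x, a 120% increase).
print(percent_increase(baseline_ms, 2.2), slowdown_ratio(baseline_ms, 2.2))
```

Reporting the raw per-version latencies alongside the percentage avoids this ambiguity.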
Reproducibility
- Reproducible across multiple runs and different environments
Suspected Cause
Based on commit history analysis, the regression is likely related to the Eigen update introduced in the following commit:
- Suspected commit: 7c0c6fb (Eigen update)
While the ELU kernel implementation itself does not appear to have been directly modified, the Eigen update may have affected:
- vectorization behavior,
- math kernel selection, or
- underlying execution paths for float32 ELU computation.
To reproduce
- Download the attached ELU model and benchmark script.
- Run the benchmark using the following command:
python script.py ./elu 1.20.0 1.21.0
- ./elu: directory containing the ELU model
- 1.20.0: baseline (good) onnxruntime version
- 1.21.0: regressed (bad) onnxruntime version
- Compare the reported latency between the two versions. A significant slowdown (~220%) can be observed in v1.21.0.
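The attached script is not reproduced in this issue. As a rough sketch of how the latency could be measured for whichever onnxruntime version is installed, something like the following might be used. The function names, the warmup and iteration counts, and the model path passed to `main` are assumptions for illustration; the input name `X`, the shape, and the value range follow the test case details above.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(run: Callable[[], object], warmup: int = 10, iters: int = 100) -> float:
    """Median wall-clock latency of run() in milliseconds."""
    for _ in range(warmup):
        run()  # warm caches and lazy initialization before timing
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def make_elu_runner(model_path: str) -> Callable[[], object]:
    """Build a zero-argument callable that runs the ELU model once."""
    # Imported lazily so the timing helper works without onnxruntime installed.
    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    x = np.random.uniform(0.1, 9.999, size=(2, 64, 28, 28)).astype(np.float32)
    return lambda: sess.run(None, {"X": x})


def main(model_path: str) -> None:
    import onnxruntime as ort

    latency = measure_latency_ms(make_elu_runner(model_path))
    print(f"onnxruntime {ort.__version__}: median {latency:.4f} ms")
```

Calling `main` with the path to the ELU model once in an environment with 1.20.0 installed and once in an environment with 1.21.0 installed gives two medians that can be compared directly.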
Urgency
No response
Platform
Linux
OS Version
Ubuntu 24.04.3 LTS
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
Yes