Open
Labels: Module:Performance (General performance issues), Module:Polygraphy (Issues with Polygraphy), Module:Quantization (Issues related to Quantization)
Description
I ran the command below to check the FP16 outputs of the TensorRT engine built from model.onnx (the ONNX model itself is FP32) and got the following error.
polygraphy run model.onnx --trt --fp16 --precision-constraints none --data-loader-script loader.py -v --validate --fail-fast
[I] Output Validation | Runners: ['trt-runner-N0-06/30/25-19:45:22']
[I] trt-runner-N0-06/30/25-19:45:22 | Validating output: pred_multipath (check_inf=True, check_nan=True)
[I] mean=nan, std-dev=nan, var=nan, median=nan, min=nan at (0, 0, 0, 0), max=nan at (0, 0, 0, 0), avg-magnitude=nan, p90=nan, p95=nan, p99=nan
[V] Could not generate histogram. Note: Error was: autodetected range of [nan, nan] is not finite
[V]
[E] NaN Detected | One or more NaNs were encountered in this output
[I] Note: Use -vv or set logging verbosity to EXTRA_VERBOSE to display locations of NaNs
[E] Inf Detected | One or more non-finite values were encountered in this output
[I] Note: Use -vv or set logging verbosity to EXTRA_VERBOSE to display non-finite values
[E] FAILED | Errors detected in output: pred_multipath
[I] trt-runner-N0-06/30/25-19:45:22 | Validating output: path_prob (check_inf=True, check_nan=True)
[I] mean=nan, std-dev=nan, var=nan, median=nan, min=nan at (0, 0), max=nan at (0, 0), avg-magnitude=nan, p90=nan, p95=nan, p99=nan
[V] ---- Values ----
[[nan nan nan nan nan]]
[V] Could not generate histogram. Note: Error was: autodetected range of [nan, nan] is not finite
[V]
[E] NaN Detected | One or more NaNs were encountered in this output
[E] Inf Detected | One or more non-finite values were encountered in this output
[E] FAILED | Errors detected in output: path_prob
[I] trt-runner-N0-06/30/25-19:45:22 | Validating output: pred_target_agent_attribute (check_inf=True, check_nan=True)
[I] mean=20.279, std-dev=30.123, var=907.39, median=3.0488, min=1.7881e-07 at (0, 3), max=80.562 at (0, 0), avg-magnitude=20.279, p90=63.444, p95=72.003, p99=78.851
[V] ---- Values ----
[[8.0562500e+01 5.2031250e+01 7.6293945e-05 1.7881393e-07 3.0488281e+00
4.5117188e+00 1.8007812e+00]]
[V] ---- Histogram ----
Bin Range | Num Elems | Visualization
(1.79e-07, 8.06) | 5 | ########################################
(8.06 , 16.1) | 0 |
(16.1 , 24.2) | 0 |
(24.2 , 32.2) | 0 |
(32.2 , 40.3) | 0 |
(40.3 , 48.3) | 0 |
(48.3 , 56.4) | 1 | ########
(56.4 , 64.4) | 0 |
(64.4 , 72.5) | 0 |
(72.5 , 80.6) | 1 | ########
[I] PASSED | Output: pred_target_agent_attribute is valid
[I] trt-runner-N0-06/30/25-19:45:22 | Validating output: pred_scores (check_inf=True, check_nan=True)
[I] mean=0.092712, std-dev=0, var=0, median=0.092712, min=0.092712 at (0,), max=0.092712 at (0,), avg-magnitude=0.092712, p90=0.092712, p95=0.092712, p99=0.092712
[V] ---- Values ----
[0.0927124]
[V] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-0.407 , -0.307 ) | 0 |
(-0.307 , -0.207 ) | 0 |
(-0.207 , -0.107 ) | 0 |
(-0.107 , -0.00729) | 0 |
(-0.00729, 0.0927 ) | 0 |
(0.0927 , 0.193 ) | 1 | ########################################
(0.193 , 0.293 ) | 0 |
(0.293 , 0.393 ) | 0 |
(0.393 , 0.493 ) | 0 |
(0.493 , 0.593 ) | 0 |
[I] PASSED | Output: pred_scores is valid
[I] trt-runner-N0-06/30/25-19:45:22 | Validating output: pred_ttc (check_inf=True, check_nan=True)
[I] mean=4, std-dev=0, var=0, median=4, min=4 at (0,), max=4 at (0,), avg-magnitude=4, p90=4, p95=4, p99=4
[V] ---- Values ----
[4.]
[V] ---- Histogram ----
Bin Range | Num Elems | Visualization
(3.5, 3.6) | 0 |
(3.6, 3.7) | 0 |
(3.7, 3.8) | 0 |
(3.8, 3.9) | 0 |
(3.9, 4 ) | 0 |
(4 , 4.1) | 1 | ########################################
(4.1, 4.2) | 0 |
(4.2, 4.3) | 0 |
(4.3, 4.4) | 0 |
(4.4, 4.5) | 0 |
[I] PASSED | Output: pred_ttc is valid
[E] FAILED | Output Validation
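For context, one common way an FP32 model ends up with NaNs only under --fp16 is overflow: any intermediate value above the largest finite FP16 number (65504) becomes inf, and a later operation such as inf / inf or inf - inf turns it into NaN. The sketch below only illustrates that mechanism with NumPy; the naive softmax and the logits are made up for the example and are not taken from the model or from how TensorRT implements its layers.

```python
import numpy as np

# Largest finite value representable in FP16.
FP16_MAX = float(np.finfo(np.float16).max)


def naive_softmax(x):
    """Softmax without the usual max-subtraction, so exp() can overflow."""
    e = np.exp(x)
    return e / e.sum()


# Hypothetical logits: exp(12) is about 1.6e5, which exceeds FP16_MAX.
logits = np.array([12.0, 11.0, 10.0], dtype=np.float32)

fp32_out = naive_softmax(logits)

# Silence the overflow/invalid warnings NumPy raises for the FP16 path.
with np.errstate(over="ignore", invalid="ignore"):
    fp16_out = naive_softmax(logits.astype(np.float16))

print("FP16 max:", FP16_MAX)
print("FP32 result:", fp32_out)
print("FP16 result:", fp16_out)
print("FP16 has NaN:", bool(np.isnan(fp16_out).any()))
```

In FP32 every intermediate stays finite, while in FP16 the first exponential overflows to inf and the division inf / inf yields NaN. Whether something similar happens in this model can only be confirmed by inspecting its intermediate tensors.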
I then ran the command below to get layer-wise outputs, but the engine build failed with the following error:
polygraphy run model.onnx --trt --fp16 --precision-constraints none --data-loader-script loader.py -v --validate --fail-fast --trt-outputs mark all --save-outputs outputs.json
[E] 2: [myelinBuilderUtils.cpp::operator()::751] Error Code 2: Internal Error ([ShapeHostToDeviceCopy 0] requires bool or uint8 I/O but node can not be handled by Myelin. Operation is not supported.)
[!] Invalid Engine. Please ensure the engine was built correctly
As I understand it, --trt-outputs mark all may disable TensorRT's layer fusion, resulting in performance degradation or errors (such as Myelin optimizer errors). Is there a way to get layer-wise outputs from the TensorRT model? How can I debug this NaN issue?
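One possible way to narrow this down without marking every TensorRT tensor as an output is to look at the FP32 reference activations instead (for example from an ONNX Runtime run with --onnx-outputs mark all) and flag tensors whose magnitude would not fit in FP16. The following is a minimal sketch under that assumption: the function name fp16_overflow_candidates, the tensor names, and the values are all hypothetical, and the input is assumed to be a plain name-to-array mapping in graph order.

```python
import numpy as np

# Largest finite FP16 value (65504); used as a conservative threshold.
FP16_MAX = float(np.finfo(np.float16).max)


def fp16_overflow_candidates(tensors):
    """Return (name, max_abs) for tensors that would overflow or are already non-finite.

    `tensors` maps tensor names to FP32 arrays, ideally in graph order so the
    first entry returned is the earliest suspicious tensor.
    """
    flagged = []
    for name, values in tensors.items():
        arr = np.asarray(values, dtype=np.float32)
        if arr.size == 0:
            continue
        max_abs = float(np.max(np.abs(arr)))
        # NaN fails every comparison, so non-finite values are checked explicitly.
        if not np.isfinite(max_abs) or max_abs > FP16_MAX:
            flagged.append((name, max_abs))
    return flagged


# Hypothetical FP32 reference activations, not taken from the model in this issue.
reference = {
    "conv_out": np.array([0.5, -3.0], dtype=np.float32),
    "pre_softmax": np.array([7.0e4, 1.0], dtype=np.float32),
    "head_out": np.array([4.0], dtype=np.float32),
}

candidates = fp16_overflow_candidates(reference)
for name, max_abs in candidates:
    print(f"{name}: max |x| = {max_abs:.4g} is outside the finite FP16 range")
```

Tensors flagged this way are only candidates. The layers that produce them are the natural ones to try keeping in FP32, or to mark individually as TensorRT outputs instead of using mark all.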