Description
I hit the following error while building the engine with trtexec:
[11/11/2025-11:33:53] [V] [TRT] /bridge_model/depth_head/output_conv1/Conv (CaskConvolution[0x80000009]) profiling completed in 0.0160466 seconds. Fastest Tactic: 0x999e005e3b016ea6 Time: 0.338071
[11/11/2025-11:33:53] [V] [TRT] Skipping CaskFlattenConvolution: No valid tactics for /bridge_model/depth_head/output_conv1/Conv
[11/11/2025-11:33:53] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 0x999e005e3b016ea6
[11/11/2025-11:33:53] [V] [TRT] =============== Computing costs for /bridge_model/depth_feature_down/Conv
[11/11/2025-11:33:53] [V] [TRT] *************** Autotuning format combination: Float((* 20480 E1),(* 160 E1),160,1) where E0=(- (MUL_ADD 16 (CEIL_DIV (+ height -15) 16) 16) height) E1=(CAST_F32_TO_I (FLOOR (MUL_ADD_F 2 (CAST_I_TO_F32 (MUL_ADD 4 (CEIL_DIV (MUL_ADD 7 (CEIL_DIV (+ (+ (MUL_ADD -16 (CEIL_DIV (+ E0 -15) 16) E0) height) -7) 8) -27) 14) 4)) 0))) -> Float((MUL_ADD 5120 E1 5120),(MUL_ADD 40 E1 40),40,1) where E0=(- (MUL_ADD 16 (CEIL_DIV (+ height -15) 16) 16) height) E1=(CEIL_DIV (+ (+ (MUL_ADD -16 (CEIL_DIV (+ E0 -15) 16) E0) height) -15) 8) where E0=(- (MUL_ADD 16 (CEIL_DIV (+ height -15) 16) 16) height) E1=(CAST_F32_TO_I (FLOOR (MUL_ADD_F 2 (CAST_I_TO_F32 (MUL_ADD 4 (CEIL_DIV (MUL_ADD 7 (CEIL_DIV (+ E2 -7) 8) -27) 14) 4)) 0))) E2=(+ (MUL_ADD -16 (CEIL_DIV (+ E0 -15) 16) E0) height) E3=(CEIL_DIV (+ E2 -15) 8) ***************
[11/11/2025-11:33:53] [E] Error[2]: [convBaseBuilder.cpp::nvinfer1::builder::CaskConvBaseBuilder<class nvinfer1::rt::task::CaskConvolutionRunner,-2147483639>::createConvolution::259] Error Code 2: Internal Error (Assertion isOpConsistent(convolution.get()) failed. Cask convolution isConsistent check failed.)
I then ran several rounds of experiments with Polygraphy:
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,320,320] right_image:[1,3,320,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,320,320] right_image:[1,3,320,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 320, 320)],
right_image [shape=(1, 3, 320, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:48:49 | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[I] onnxrt-runner-N0-11/11/25-14:48:49
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 320, 320)],
right_image [dtype=float32, shape=(1, 3, 320, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:48:49
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 320, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:48:49 | Completed 1 iteration(s) in 4672 ms | Average inference time: 4672 ms.
[I] PASSED | Runtime: 17.783s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,320,320] right_image:[1,3,320,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,256,320] right_image:[1,3,256,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,256,320] right_image:[1,3,256,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 256, 320)],
right_image [shape=(1, 3, 256, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:49:23 | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[I] onnxrt-runner-N0-11/11/25-14:49:23
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 256, 320)],
right_image [dtype=float32, shape=(1, 3, 256, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:49:23
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 256, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:49:23 | Completed 1 iteration(s) in 3535 ms | Average inference time: 3535 ms.
[I] PASSED | Runtime: 15.792s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,256,320] right_image:[1,3,256,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,128,320] right_image:[1,3,128,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,128,320] right_image:[1,3,128,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 128, 320)],
right_image [shape=(1, 3, 128, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:49:52 | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[I] onnxrt-runner-N0-11/11/25-14:49:52
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 128, 320)],
right_image [dtype=float32, shape=(1, 3, 128, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:49:52
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 128, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:49:52 | Completed 1 iteration(s) in 1923 ms | Average inference time: 1923 ms.
[I] PASSED | Runtime: 14.707s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,128,320] right_image:[1,3,128,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,32,320] right_image:[1,3,32,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,32,320] right_image:[1,3,32,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 32, 320)],
right_image [shape=(1, 3, 32, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:50:17 | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[I] onnxrt-runner-N0-11/11/25-14:50:17
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 32, 320)],
right_image [dtype=float32, shape=(1, 3, 32, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:50:17
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 32, 320)]}
[I] onnxrt-runner-N0-11/11/25-14:50:17 | Completed 1 iteration(s) in 720.6 ms | Average inference time: 720.6 ms.
[I] PASSED | Runtime: 13.987s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --onnxrt --input-shapes left_image:[1,3,32,320] right_image:[1,3,32,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,320,320] right_image:[1,3,320,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,320,320] right_image:[1,3,320,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 320, 320)],
right_image [shape=(1, 3, 320, 320)]}
[I] TF32 is disabled by default. Turn on TF32 for better performance with minor accuracy differences.
[I] trt-runner-N0-11/11/25-14:54:16 | Activating and starting inference
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 882710710
[I] Configuring with profiles:[
Profile 0:
{left_image [min=[1, 3, 320, 320], opt=[1, 3, 320, 320], max=[1, 3, 320, 320]],
right_image [min=[1, 3, 320, 320], opt=[1, 3, 320, 320], max=[1, 3, 320, 320]]}
]
[W] profileSharing0806 is on by default in TensorRT 10.0. This flag is deprecated and has no effect.
[I] Building engine with configuration:
Flags | []
Engine Capability | EngineCapability.STANDARD
Memory Pools | [WORKSPACE: 8187.50 MiB, TACTIC_DRAM: 8187.50 MiB, TACTIC_SHARED_MEMORY: 1024.00 MiB]
Tactic Sources | [EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.DETAILED
Preview Features | [PROFILE_SHARING_0806]
[I] Finished engine building in 114.424 seconds
[I] trt-runner-N0-11/11/25-14:54:16
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 320, 320)],
right_image [dtype=float32, shape=(1, 3, 320, 320)]}
[I] trt-runner-N0-11/11/25-14:54:16
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 320, 320)]}
[I] trt-runner-N0-11/11/25-14:54:16 | Completed 1 iteration(s) in 1508 ms | Average inference time: 1508 ms.
[I] PASSED | Runtime: 127.165s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,320,320] right_image:[1,3,320,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,256,320] right_image:[1,3,256,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,256,320] right_image:[1,3,256,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 256, 320)],
right_image [shape=(1, 3, 256, 320)]}
[I] TF32 is disabled by default. Turn on TF32 for better performance with minor accuracy differences.
[I] trt-runner-N0-11/11/25-14:57:02 | Activating and starting inference
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 882710710
[I] Configuring with profiles:[
Profile 0:
{left_image [min=[1, 3, 256, 320], opt=[1, 3, 256, 320], max=[1, 3, 256, 320]],
right_image [min=[1, 3, 256, 320], opt=[1, 3, 256, 320], max=[1, 3, 256, 320]]}
]
[W] profileSharing0806 is on by default in TensorRT 10.0. This flag is deprecated and has no effect.
[I] Building engine with configuration:
Flags | []
Engine Capability | EngineCapability.STANDARD
Memory Pools | [WORKSPACE: 8187.50 MiB, TACTIC_DRAM: 8187.50 MiB, TACTIC_SHARED_MEMORY: 1024.00 MiB]
Tactic Sources | [EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.DETAILED
Preview Features | [PROFILE_SHARING_0806]
[I] Finished engine building in 135.344 seconds
[I] trt-runner-N0-11/11/25-14:57:02
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 256, 320)],
right_image [dtype=float32, shape=(1, 3, 256, 320)]}
[I] trt-runner-N0-11/11/25-14:57:02
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 256, 320)]}
[I] trt-runner-N0-11/11/25-14:57:02 | Completed 1 iteration(s) in 1436 ms | Average inference time: 1436 ms.
[I] PASSED | Runtime: 147.875s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,256,320] right_image:[1,3,256,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,128,320] right_image:[1,3,128,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,128,320] right_image:[1,3,128,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 128, 320)],
right_image [shape=(1, 3, 128, 320)]}
[I] TF32 is disabled by default. Turn on TF32 for better performance with minor accuracy differences.
[I] trt-runner-N0-11/11/25-15:00:04 | Activating and starting inference
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 882710710
[I] Configuring with profiles:[
Profile 0:
{left_image [min=[1, 3, 128, 320], opt=[1, 3, 128, 320], max=[1, 3, 128, 320]],
right_image [min=[1, 3, 128, 320], opt=[1, 3, 128, 320], max=[1, 3, 128, 320]]}
]
[W] profileSharing0806 is on by default in TensorRT 10.0. This flag is deprecated and has no effect.
[I] Building engine with configuration:
Flags | []
Engine Capability | EngineCapability.STANDARD
Memory Pools | [WORKSPACE: 8187.50 MiB, TACTIC_DRAM: 8187.50 MiB, TACTIC_SHARED_MEMORY: 1024.00 MiB]
Tactic Sources | [EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.DETAILED
Preview Features | [PROFILE_SHARING_0806]
[I] Finished engine building in 116.202 seconds
[I] trt-runner-N0-11/11/25-15:00:04
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 128, 320)],
right_image [dtype=float32, shape=(1, 3, 128, 320)]}
[I] trt-runner-N0-11/11/25-15:00:04
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 128, 320)]}
[I] trt-runner-N0-11/11/25-15:00:04 | Completed 1 iteration(s) in 1450 ms | Average inference time: 1450 ms.
[I] PASSED | Runtime: 128.698s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,128,320] right_image:[1,3,128,320]
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,32,320] right_image:[1,3,32,320]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,32,320] right_image:[1,3,32,320]
[I] Will generate inference input data according to provided TensorMetadata: {left_image [shape=(1, 3, 32, 320)],
right_image [shape=(1, 3, 32, 320)]}
[I] TF32 is disabled by default. Turn on TF32 for better performance with minor accuracy differences.
[I] trt-runner-N0-11/11/25-15:02:42 | Activating and starting inference
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 882710710
[I] Configuring with profiles:[
Profile 0:
{left_image [min=[1, 3, 32, 320], opt=[1, 3, 32, 320], max=[1, 3, 32, 320]],
right_image [min=[1, 3, 32, 320], opt=[1, 3, 32, 320], max=[1, 3, 32, 320]]}
]
[W] profileSharing0806 is on by default in TensorRT 10.0. This flag is deprecated and has no effect.
[I] Building engine with configuration:
Flags | []
Engine Capability | EngineCapability.STANDARD
Memory Pools | [WORKSPACE: 8187.50 MiB, TACTIC_DRAM: 8187.50 MiB, TACTIC_SHARED_MEMORY: 1024.00 MiB]
Tactic Sources | [EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.DETAILED
Preview Features | [PROFILE_SHARING_0806]
[I] Finished engine building in 123.805 seconds
[I] trt-runner-N0-11/11/25-15:02:42
---- Inference Input(s) ----
{left_image [dtype=float32, shape=(1, 3, 32, 320)],
right_image [dtype=float32, shape=(1, 3, 32, 320)]}
[I] trt-runner-N0-11/11/25-15:02:42
---- Inference Output(s) ----
{disparity [dtype=float32, shape=(1, 32, 320)]}
[I] trt-runner-N0-11/11/25-15:02:42 | Completed 1 iteration(s) in 1328 ms | Average inference time: 1328 ms.
[I] PASSED | Runtime: 136.176s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --input-shapes left_image:[1,3,32,320] right_image:[1,3,32,320]
All of the static shapes pass; the error only appears once I build with a dynamic profile:
(bridgedepth) E:\AI\Code\WorkCode\RT-Stereo\onnx>polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --trt-min-shapes left_image:[1,3,32,320] right_image:[1,3,32,320] --trt-opt-shapes left_image:[1,3,320,320] right_image:[1,3,320,320] --trt-max-shapes left_image:[1,3,480,640] right_image:[1,3,480,640]
[I] RUNNING | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --trt-min-shapes left_image:[1,3,32,320] right_image:[1,3,32,320] --trt-opt-shapes left_image:[1,3,320,320] right_image:[1,3,320,320] --trt-max-shapes left_image:[1,3,480,640] right_image:[1,3,480,640]
[I] TF32 is disabled by default. Turn on TF32 for better performance with minor accuracy differences.
[I] trt-runner-N0-11/11/25-15:09:30 | Activating and starting inference
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING **************************************************************************\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 882710710
[I] Configuring with profiles:[
Profile 0:
{left_image [min=[1, 3, 32, 320], opt=[1, 3, 320, 320], max=[1, 3, 480, 640]],
right_image [min=[1, 3, 32, 320], opt=[1, 3, 320, 320], max=[1, 3, 480, 640]]}
]
[W] profileSharing0806 is on by default in TensorRT 10.0. This flag is deprecated and has no effect.
[I] Building engine with configuration:
Flags | []
Engine Capability | EngineCapability.STANDARD
Memory Pools | [WORKSPACE: 8187.50 MiB, TACTIC_DRAM: 8187.50 MiB, TACTIC_SHARED_MEMORY: 1024.00 MiB]
Tactic Sources | [EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.DETAILED
Preview Features | [PROFILE_SHARING_0806]
[E] [convBaseBuilder.cpp::nvinfer1::builder::CaskConvBaseBuilder<class nvinfer1::rt::task::CaskConvolutionRunner,-2147483639>::createConvolution::259] Error Code 2: Internal Error (Assertion isOpConsistent(convolution.get()) failed. Cask convolution isConsistent check failed.)
[!] Invalid Engine. Please ensure the engine was built correctly
[E] FAILED | Runtime: 108.198s | Command: D:\ProgramData\anaconda3\envs\bridgedepth\Scripts\polygraphy run E:\AI\Code\WorkCode\RT-Stereo\onnx\bridgedepth_eth3d_opset17_320_320_dynamic.onnx --trt --trt-min-shapes left_image:[1,3,32,320] right_image:[1,3,32,320] --trt-opt-shapes left_image:[1,3,320,320] right_image:[1,3,320,320] --trt-max-shapes left_image:[1,3,480,640] right_image:[1,3,480,640]
Using Netron, I located /bridge_model/depth_feature_down/Conv (the node the trtexec log fails on) and traced it back to this line of code:
self.depth_feature_down = nn.Conv2d(depth_feature_dim, dim, kernel_size=4, stride=4, padding=0)
I don't know how to resolve this.
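If that stride-4, kernel-4 convolution does turn out to be the node that trips the Cask consistency assertion under dynamic heights, one workaround worth trying is to rewrite it as a space-to-depth (pixel-unshuffle, factor 4) rearrangement followed by a 1x1 convolution that reuses the same weights. The two forms are mathematically identical, but the rewrite exports to different ONNX ops and so steers TensorRT toward different kernels. This is only a sketch of the equivalence in NumPy, not a confirmed fix for this model:

```python
import numpy as np

def conv4x4_stride4(x, w):
    """Direct NCHW convolution, kernel 4, stride 4, no padding, no bias."""
    n, cin, h, wd = x.shape
    cout = w.shape[0]
    ho, wo = h // 4, wd // 4
    wm = w.reshape(cout, -1)                    # (Cout, Cin*4*4)
    out = np.empty((n, cout, ho, wo), dtype=x.dtype)
    for i in range(ho):
        for j in range(wo):
            patch = x[:, :, 4*i:4*i+4, 4*j:4*j+4].reshape(n, -1)
            out[:, :, i, j] = patch @ wm.T
    return out

def unshuffle_then_1x1(x, w):
    """Equivalent form: space-to-depth (factor 4) followed by a 1x1 conv."""
    n, cin, h, wd = x.shape
    cout = w.shape[0]
    ho, wo = h // 4, wd // 4
    # (N, C, H, W) -> (N, C*16, H/4, W/4); channel order is c*16 + di*4 + dj,
    # which matches how the (Cout, Cin, 4, 4) weights flatten below.
    xs = (x.reshape(n, cin, ho, 4, wo, 4)
           .transpose(0, 1, 3, 5, 2, 4)
           .reshape(n, cin * 16, ho, wo))
    w1 = w.reshape(cout, cin * 16)              # same weights, viewed as 1x1
    return np.einsum('ok,nkhw->nohw', w1, xs)

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 32, 32))
w = rng.standard_normal((8, 3, 4, 4))
print(np.allclose(conv4x4_stride4(x, w), unshuffle_then_1x1(x, w)))  # True
```

In PyTorch this would correspond to replacing the layer with `nn.PixelUnshuffle(4)` followed by `nn.Conv2d(depth_feature_dim * 16, dim, kernel_size=1)`, copying the original weights via `weight.view(dim, depth_feature_dim * 16, 1, 1)`, and re-exporting the ONNX model (layer names here follow the snippet above; verify the channel ordering against your export).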
Environment
TensorRT Version: 10.3 and 10.13
NVIDIA GPU: RTX 2000 Ada
NVIDIA Driver Version:
CUDA Version: 11.8
CUDNN Version:
Operating System:
Python Version (if applicable): 3.10.18
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 2.7.0+cu118
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):