
ONNX-to-engine build failure with TensorRT 8.5.1 when running trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096 on a Tesla T4 GPU #4506

@hooldylan

Description

Here is the complete error output:
root@2bfd78f50ff8:/mnt/HD1/base_onnx# trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096
&&&& RUNNING TensorRT.trtexec [TensorRT v8501] # trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096
[07/02/2025-09:09:25] [W] --workspace flag has been deprecated by --memPoolSize flag.
[07/02/2025-09:09:25] [I] === Model Options ===
[07/02/2025-09:09:25] [I] Format: ONNX
[07/02/2025-09:09:25] [I] Model: sam_vit_b_prompt_mask.onnx
[07/02/2025-09:09:25] [I] Output:
[07/02/2025-09:09:25] [I] === Build Options ===
[07/02/2025-09:09:25] [I] Max batch: explicit batch
[07/02/2025-09:09:25] [I] Memory Pools: workspace: 4096 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[07/02/2025-09:09:25] [I] minTiming: 1
[07/02/2025-09:09:25] [I] avgTiming: 8
[07/02/2025-09:09:25] [I] Precision: FP32
[07/02/2025-09:09:25] [I] LayerPrecisions:
[07/02/2025-09:09:25] [I] Calibration:
[07/02/2025-09:09:25] [I] Refit: Disabled
[07/02/2025-09:09:25] [I] Sparsity: Disabled
[07/02/2025-09:09:25] [I] Safe mode: Disabled
[07/02/2025-09:09:25] [I] DirectIO mode: Disabled
[07/02/2025-09:09:25] [I] Restricted mode: Disabled
[07/02/2025-09:09:25] [I] Build only: Disabled
[07/02/2025-09:09:25] [I] Save engine: sam_vit_b_prompt_mask.engine
[07/02/2025-09:09:25] [I] Load engine:
[07/02/2025-09:09:25] [I] Profiling verbosity: 0
[07/02/2025-09:09:25] [I] Tactic sources: Using default tactic sources
[07/02/2025-09:09:25] [I] timingCacheMode: local
[07/02/2025-09:09:25] [I] timingCacheFile:
[07/02/2025-09:09:25] [I] Heuristic: Disabled
[07/02/2025-09:09:25] [I] Preview Features: Use default preview flags.
[07/02/2025-09:09:25] [I] Input(s)s format: fp32:CHW
[07/02/2025-09:09:25] [I] Output(s)s format: fp32:CHW
[07/02/2025-09:09:25] [I] Input build shapes: model
[07/02/2025-09:09:25] [I] Input calibration shapes: model
[07/02/2025-09:09:25] [I] === System Options ===
[07/02/2025-09:09:25] [I] Device: 0
[07/02/2025-09:09:25] [I] DLACore:
[07/02/2025-09:09:25] [I] Plugins:
[07/02/2025-09:09:25] [I] === Inference Options ===
[07/02/2025-09:09:25] [I] Batch: Explicit
[07/02/2025-09:09:25] [I] Input inference shapes: model
[07/02/2025-09:09:25] [I] Iterations: 10
[07/02/2025-09:09:25] [I] Duration: 3s (+ 200ms warm up)
[07/02/2025-09:09:25] [I] Sleep time: 0ms
[07/02/2025-09:09:25] [I] Idle time: 0ms
[07/02/2025-09:09:25] [I] Streams: 1
[07/02/2025-09:09:25] [I] ExposeDMA: Disabled
[07/02/2025-09:09:25] [I] Data transfers: Enabled
[07/02/2025-09:09:25] [I] Spin-wait: Disabled
[07/02/2025-09:09:25] [I] Multithreading: Disabled
[07/02/2025-09:09:25] [I] CUDA Graph: Disabled
[07/02/2025-09:09:25] [I] Separate profiling: Disabled
[07/02/2025-09:09:25] [I] Time Deserialize: Disabled
[07/02/2025-09:09:25] [I] Time Refit: Disabled
[07/02/2025-09:09:25] [I] NVTX verbosity: 0
[07/02/2025-09:09:25] [I] Persistent Cache Ratio: 0
[07/02/2025-09:09:25] [I] Inputs:
[07/02/2025-09:09:25] [I] === Reporting Options ===
[07/02/2025-09:09:25] [I] Verbose: Disabled
[07/02/2025-09:09:25] [I] Averages: 10 inferences
[07/02/2025-09:09:25] [I] Percentiles: 90,95,99
[07/02/2025-09:09:25] [I] Dump refittable layers:Disabled
[07/02/2025-09:09:25] [I] Dump output: Disabled
[07/02/2025-09:09:25] [I] Profile: Disabled
[07/02/2025-09:09:25] [I] Export timing to JSON file:
[07/02/2025-09:09:25] [I] Export output to JSON file:
[07/02/2025-09:09:25] [I] Export profile to JSON file:
[07/02/2025-09:09:25] [I]
[07/02/2025-09:09:25] [I] === Device Information ===
[07/02/2025-09:09:25] [I] Selected Device: Tesla T4
[07/02/2025-09:09:25] [I] Compute Capability: 7.5
[07/02/2025-09:09:25] [I] SMs: 40
[07/02/2025-09:09:25] [I] Compute Clock Rate: 1.59 GHz
[07/02/2025-09:09:25] [I] Device Global Memory: 14960 MiB
[07/02/2025-09:09:25] [I] Shared Memory per SM: 64 KiB
[07/02/2025-09:09:25] [I] Memory Bus Width: 256 bits (ECC enabled)
[07/02/2025-09:09:25] [I] Memory Clock Rate: 5.001 GHz
[07/02/2025-09:09:25] [I]
[07/02/2025-09:09:25] [I] TensorRT version: 8.5.1
[07/02/2025-09:09:26] [I] [TRT] [MemUsageChange] Init CUDA: CPU +305, GPU +0, now: CPU 318, GPU 239 (MiB)
[07/02/2025-09:09:28] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +262, GPU +74, now: CPU 635, GPU 313 (MiB)
[07/02/2025-09:09:28] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[07/02/2025-09:09:28] [I] Start parsing network model
[07/02/2025-09:09:28] [I] [TRT] ----------------------------------------------------------------
[07/02/2025-09:09:28] [I] [TRT] Input filename: sam_vit_b_prompt_mask.onnx
[07/02/2025-09:09:28] [I] [TRT] ONNX IR version: 0.0.7
[07/02/2025-09:09:28] [I] [TRT] Opset version: 14
[07/02/2025-09:09:28] [I] [TRT] Producer name: pytorch
[07/02/2025-09:09:28] [I] [TRT] Producer version: 2.7.1
[07/02/2025-09:09:28] [I] [TRT] Domain:
[07/02/2025-09:09:28] [I] [TRT] Model version: 0
[07/02/2025-09:09:28] [I] [TRT] Doc string:
[07/02/2025-09:09:28] [I] [TRT] ----------------------------------------------------------------
[07/02/2025-09:09:28] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [E] Error[4]: [graph.cpp::symbolicExecute::611] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:726: While parsing node number 146 [Tile -> "/Tile_output_0"]:
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:727: --- Begin node ---
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:728: input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"

[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:729: --- End node ---
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:731: ERROR: parsers/onnx/ModelImporter.cpp:185 In function parseGraph:
[6] Invalid Node - /Tile
[graph.cpp::symbolicExecute::611] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[07/02/2025-09:09:28] [E] Failed to parse onnx file
[07/02/2025-09:09:28] [I] Finish parsing network model
[07/02/2025-09:09:28] [E] Parsing model failed
[07/02/2025-09:09:28] [E] Failed to create engine from model or file.
[07/02/2025-09:09:28] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8501] # trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096

Could you help me understand what this error means and how to resolve it? Thank you.
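From the log, the parse fails at node 146 (/Tile): its repeats input (/Reshape_2_output_0) is produced through the /OneHot node, and TensorRT 8.5 reports that a OneHot layer cannot be used to compute a shape tensor. One workaround I am considering (an untested sketch for this model, assuming Polygraphy and onnxruntime are installed, and with the "_folded" output name just for illustration) is to constant-fold the shape subgraph before building, so the OneHot output becomes a literal constant:

```
# Constant-fold the graph so the OneHot feeding /Tile's repeats input
# becomes a constant (pip install polygraphy onnxruntime first).
polygraphy surgeon sanitize sam_vit_b_prompt_mask.onnx \
    --fold-constants \
    -o sam_vit_b_prompt_mask_folded.onnx

# Retry the build; the log also warns that --workspace is deprecated
# in 8.5, so pass the limit via --memPoolSize instead (value in MiB).
trtexec --onnx=sam_vit_b_prompt_mask_folded.onnx \
        --saveEngine=sam_vit_b_prompt_mask.engine \
        --memPoolSize=workspace:4096
```

Is that the right direction, or does this model require a newer TensorRT version that can use OneHot outputs in shape computations?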

Metadata


    Labels

    Module:Engine Build: Issues with building TensorRT engines
    Module:ONNX: Issues relating to ONNX usage and import
    waiting for feedback: Requires more information from the author to make progress on the issue.
