
ONNX-to-engine build failure with TensorRT 8.5.1 when running trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096 on a Tesla T4 GPU #4506

@hooldylan

Description

Here is the complete error output:
root@2bfd78f50ff8:/mnt/HD1/base_onnx# trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096
&&&& RUNNING TensorRT.trtexec [TensorRT v8501] # trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096
[07/02/2025-09:09:25] [W] --workspace flag has been deprecated by --memPoolSize flag.
[07/02/2025-09:09:25] [I] === Model Options ===
[07/02/2025-09:09:25] [I] Format: ONNX
[07/02/2025-09:09:25] [I] Model: sam_vit_b_prompt_mask.onnx
[07/02/2025-09:09:25] [I] Output:
[07/02/2025-09:09:25] [I] === Build Options ===
[07/02/2025-09:09:25] [I] Max batch: explicit batch
[07/02/2025-09:09:25] [I] Memory Pools: workspace: 4096 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[07/02/2025-09:09:25] [I] minTiming: 1
[07/02/2025-09:09:25] [I] avgTiming: 8
[07/02/2025-09:09:25] [I] Precision: FP32
[07/02/2025-09:09:25] [I] LayerPrecisions:
[07/02/2025-09:09:25] [I] Calibration:
[07/02/2025-09:09:25] [I] Refit: Disabled
[07/02/2025-09:09:25] [I] Sparsity: Disabled
[07/02/2025-09:09:25] [I] Safe mode: Disabled
[07/02/2025-09:09:25] [I] DirectIO mode: Disabled
[07/02/2025-09:09:25] [I] Restricted mode: Disabled
[07/02/2025-09:09:25] [I] Build only: Disabled
[07/02/2025-09:09:25] [I] Save engine: sam_vit_b_prompt_mask.engine
[07/02/2025-09:09:25] [I] Load engine:
[07/02/2025-09:09:25] [I] Profiling verbosity: 0
[07/02/2025-09:09:25] [I] Tactic sources: Using default tactic sources
[07/02/2025-09:09:25] [I] timingCacheMode: local
[07/02/2025-09:09:25] [I] timingCacheFile:
[07/02/2025-09:09:25] [I] Heuristic: Disabled
[07/02/2025-09:09:25] [I] Preview Features: Use default preview flags.
[07/02/2025-09:09:25] [I] Input(s)s format: fp32:CHW
[07/02/2025-09:09:25] [I] Output(s)s format: fp32:CHW
[07/02/2025-09:09:25] [I] Input build shapes: model
[07/02/2025-09:09:25] [I] Input calibration shapes: model
[07/02/2025-09:09:25] [I] === System Options ===
[07/02/2025-09:09:25] [I] Device: 0
[07/02/2025-09:09:25] [I] DLACore:
[07/02/2025-09:09:25] [I] Plugins:
[07/02/2025-09:09:25] [I] === Inference Options ===
[07/02/2025-09:09:25] [I] Batch: Explicit
[07/02/2025-09:09:25] [I] Input inference shapes: model
[07/02/2025-09:09:25] [I] Iterations: 10
[07/02/2025-09:09:25] [I] Duration: 3s (+ 200ms warm up)
[07/02/2025-09:09:25] [I] Sleep time: 0ms
[07/02/2025-09:09:25] [I] Idle time: 0ms
[07/02/2025-09:09:25] [I] Streams: 1
[07/02/2025-09:09:25] [I] ExposeDMA: Disabled
[07/02/2025-09:09:25] [I] Data transfers: Enabled
[07/02/2025-09:09:25] [I] Spin-wait: Disabled
[07/02/2025-09:09:25] [I] Multithreading: Disabled
[07/02/2025-09:09:25] [I] CUDA Graph: Disabled
[07/02/2025-09:09:25] [I] Separate profiling: Disabled
[07/02/2025-09:09:25] [I] Time Deserialize: Disabled
[07/02/2025-09:09:25] [I] Time Refit: Disabled
[07/02/2025-09:09:25] [I] NVTX verbosity: 0
[07/02/2025-09:09:25] [I] Persistent Cache Ratio: 0
[07/02/2025-09:09:25] [I] Inputs:
[07/02/2025-09:09:25] [I] === Reporting Options ===
[07/02/2025-09:09:25] [I] Verbose: Disabled
[07/02/2025-09:09:25] [I] Averages: 10 inferences
[07/02/2025-09:09:25] [I] Percentiles: 90,95,99
[07/02/2025-09:09:25] [I] Dump refittable layers:Disabled
[07/02/2025-09:09:25] [I] Dump output: Disabled
[07/02/2025-09:09:25] [I] Profile: Disabled
[07/02/2025-09:09:25] [I] Export timing to JSON file:
[07/02/2025-09:09:25] [I] Export output to JSON file:
[07/02/2025-09:09:25] [I] Export profile to JSON file:
[07/02/2025-09:09:25] [I]
[07/02/2025-09:09:25] [I] === Device Information ===
[07/02/2025-09:09:25] [I] Selected Device: Tesla T4
[07/02/2025-09:09:25] [I] Compute Capability: 7.5
[07/02/2025-09:09:25] [I] SMs: 40
[07/02/2025-09:09:25] [I] Compute Clock Rate: 1.59 GHz
[07/02/2025-09:09:25] [I] Device Global Memory: 14960 MiB
[07/02/2025-09:09:25] [I] Shared Memory per SM: 64 KiB
[07/02/2025-09:09:25] [I] Memory Bus Width: 256 bits (ECC enabled)
[07/02/2025-09:09:25] [I] Memory Clock Rate: 5.001 GHz
[07/02/2025-09:09:25] [I]
[07/02/2025-09:09:25] [I] TensorRT version: 8.5.1
[07/02/2025-09:09:26] [I] [TRT] [MemUsageChange] Init CUDA: CPU +305, GPU +0, now: CPU 318, GPU 239 (MiB)
[07/02/2025-09:09:28] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +262, GPU +74, now: CPU 635, GPU 313 (MiB)
[07/02/2025-09:09:28] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[07/02/2025-09:09:28] [I] Start parsing network model
[07/02/2025-09:09:28] [I] [TRT] ----------------------------------------------------------------
[07/02/2025-09:09:28] [I] [TRT] Input filename: sam_vit_b_prompt_mask.onnx
[07/02/2025-09:09:28] [I] [TRT] ONNX IR version: 0.0.7
[07/02/2025-09:09:28] [I] [TRT] Opset version: 14
[07/02/2025-09:09:28] [I] [TRT] Producer name: pytorch
[07/02/2025-09:09:28] [I] [TRT] Producer version: 2.7.1
[07/02/2025-09:09:28] [I] [TRT] Domain:
[07/02/2025-09:09:28] [I] [TRT] Model version: 0
[07/02/2025-09:09:28] [I] [TRT] Doc string:
[07/02/2025-09:09:28] [I] [TRT] ----------------------------------------------------------------
[07/02/2025-09:09:28] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [I] [TRT] /pe_layer/MatMul: broadcasting input1 to make tensors conform, dims(input0)=[64,64,2][NONE] dims(input1)=[1,2,128][NONE].
[07/02/2025-09:09:28] [E] Error[4]: [graph.cpp::symbolicExecute::611] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:726: While parsing node number 146 [Tile -> "/Tile_output_0"]:
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:727: --- Begin node ---
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:728: input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"

[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:729: --- End node ---
[07/02/2025-09:09:28] [E] [TRT] parsers/onnx/ModelImporter.cpp:731: ERROR: parsers/onnx/ModelImporter.cpp:185 In function parseGraph:
[6] Invalid Node - /Tile
[graph.cpp::symbolicExecute::611] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[07/02/2025-09:09:28] [E] Failed to parse onnx file
[07/02/2025-09:09:28] [I] Finish parsing network model
[07/02/2025-09:09:28] [E] Parsing model failed
[07/02/2025-09:09:28] [E] Failed to create engine from model or file.
[07/02/2025-09:09:28] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8501] # trtexec --onnx=sam_vit_b_prompt_mask.onnx --saveEngine=sam_vit_b_prompt_mask.engine --workspace=4096

Could you help me understand what this error means and how to resolve it? Thank you.
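From the log, the parse fails at node 146 (/Tile): its repeats input (/Reshape_2_output_0) is produced through the /OneHot node, and TensorRT 8.5 reports that a OneHot layer cannot be used to compute a shape tensor. One workaround I am considering (an untested sketch for this model, assuming Polygraphy and onnxruntime are installed, and with the "_folded" output name just for illustration) is to constant-fold the shape subgraph before building, so the OneHot output becomes a literal constant:

```
# Constant-fold the graph so the OneHot feeding /Tile's repeats input
# becomes a constant (pip install polygraphy onnxruntime first).
polygraphy surgeon sanitize sam_vit_b_prompt_mask.onnx \
    --fold-constants \
    -o sam_vit_b_prompt_mask_folded.onnx

# Retry the build; the log also warns that --workspace is deprecated
# in 8.5, so pass the limit via --memPoolSize instead (value in MiB).
trtexec --onnx=sam_vit_b_prompt_mask_folded.onnx \
        --saveEngine=sam_vit_b_prompt_mask.engine \
        --memPoolSize=workspace:4096
```

Is that the right direction, or does this model require a newer TensorRT version that can use OneHot outputs in shape computations?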

Metadata


    Labels

    Module:Engine Build: Issues with building TensorRT engines
    Module:ONNX: Issues relating to ONNX usage and import
    waiting for feedback: Requires more information from the author to make progress on the issue.
