-
What is the motivation for this task?
I was trying to convert the FastFlow model to a TensorRT engine that supports dynamic batch sizes, but the ONNX-to-TensorRT conversion failed. Here are my execution steps:
1. Convert the ONNX model to dynamic batch (a minimal sketch of this step is shown after the log below).
2. Export the engine with trtexec.
3. The following is the error message:
&&&& RUNNING TensorRT.trtexec [TensorRT v8403] # trtexec --onnx=model_dynamic_batch.onnx --saveEngine=model_dynamic_batch.engine --minShapes=input:1x3x256x256 --optShapes=input:4x3x256x256 --maxShapes=input:8x3x256x256
[08/22/2023-18:13:35] [I] === Model Options ===
[08/22/2023-18:13:35] [I] Format: ONNX
[08/22/2023-18:13:35] [I] Model: model_dynamic_batch.onnx
[08/22/2023-18:13:35] [I] Output:
[08/22/2023-18:13:35] [I] === Build Options ===
[08/22/2023-18:13:35] [I] Max batch: explicit batch
[08/22/2023-18:13:35] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[08/22/2023-18:13:35] [I] minTiming: 1
[08/22/2023-18:13:35] [I] avgTiming: 8
[08/22/2023-18:13:35] [I] Precision: FP32
[08/22/2023-18:13:35] [I] LayerPrecisions:
[08/22/2023-18:13:35] [I] Calibration:
[08/22/2023-18:13:35] [I] Refit: Disabled
[08/22/2023-18:13:35] [I] Sparsity: Disabled
[08/22/2023-18:13:35] [I] Safe mode: Disabled
[08/22/2023-18:13:35] [I] DirectIO mode: Disabled
[08/22/2023-18:13:35] [I] Restricted mode: Disabled
[08/22/2023-18:13:35] [I] Build only: Disabled
[08/22/2023-18:13:35] [I] Save engine: model_dynamic_batch.engine
[08/22/2023-18:13:35] [I] Load engine:
[08/22/2023-18:13:35] [I] Profiling verbosity: 0
[08/22/2023-18:13:35] [I] Tactic sources: Using default tactic sources
[08/22/2023-18:13:35] [I] timingCacheMode: local
[08/22/2023-18:13:35] [I] timingCacheFile:
[08/22/2023-18:13:35] [I] Input(s)s format: fp32:CHW
[08/22/2023-18:13:35] [I] Output(s)s format: fp32:CHW
[08/22/2023-18:13:35] [I] Input build shape: input=1x3x256x256+4x3x256x256+8x3x256x256
[08/22/2023-18:13:35] [I] Input calibration shapes: model
[08/22/2023-18:13:35] [I] === System Options ===
[08/22/2023-18:13:35] [I] Device: 0
[08/22/2023-18:13:35] [I] DLACore:
[08/22/2023-18:13:35] [I] Plugins:
[08/22/2023-18:13:35] [I] === Inference Options ===
[08/22/2023-18:13:35] [I] Batch: Explicit
[08/22/2023-18:13:35] [I] Input inference shape: input=4x3x256x256
[08/22/2023-18:13:35] [I] Iterations: 10
[08/22/2023-18:13:35] [I] Duration: 3s (+ 200ms warm up)
[08/22/2023-18:13:35] [I] Sleep time: 0ms
[08/22/2023-18:13:35] [I] Idle time: 0ms
[08/22/2023-18:13:35] [I] Streams: 1
[08/22/2023-18:13:35] [I] ExposeDMA: Disabled
[08/22/2023-18:13:35] [I] Data transfers: Enabled
[08/22/2023-18:13:35] [I] Spin-wait: Disabled
[08/22/2023-18:13:35] [I] Multithreading: Disabled
[08/22/2023-18:13:35] [I] CUDA Graph: Disabled
[08/22/2023-18:13:35] [I] Separate profiling: Disabled
[08/22/2023-18:13:35] [I] Time Deserialize: Disabled
[08/22/2023-18:13:35] [I] Time Refit: Disabled
[08/22/2023-18:13:35] [I] Inputs:
[08/22/2023-18:13:35] [I] === Reporting Options ===
[08/22/2023-18:13:35] [I] Verbose: Disabled
[08/22/2023-18:13:35] [I] Averages: 10 inferences
[08/22/2023-18:13:35] [I] Percentile: 99
[08/22/2023-18:13:35] [I] Dump refittable layers:Disabled
[08/22/2023-18:13:35] [I] Dump output: Disabled
[08/22/2023-18:13:35] [I] Profile: Disabled
[08/22/2023-18:13:35] [I] Export timing to JSON file:
[08/22/2023-18:13:35] [I] Export output to JSON file:
[08/22/2023-18:13:35] [I] Export profile to JSON file:
[08/22/2023-18:13:35] [I]
[08/22/2023-18:13:35] [I] === Device Information ===
[08/22/2023-18:13:35] [I] Selected Device: NVIDIA GeForce RTX 3060 Ti
[08/22/2023-18:13:35] [I] Compute Capability: 8.6
[08/22/2023-18:13:35] [I] SMs: 38
[08/22/2023-18:13:35] [I] Compute Clock Rate: 1.665 GHz
[08/22/2023-18:13:35] [I] Device Global Memory: 8191 MiB
[08/22/2023-18:13:35] [I] Shared Memory per SM: 100 KiB
[08/22/2023-18:13:35] [I] Memory Bus Width: 256 bits (ECC disabled)
[08/22/2023-18:13:35] [I] Memory Clock Rate: 7.001 GHz
[08/22/2023-18:13:35] [I]
[08/22/2023-18:13:35] [I] TensorRT version: 8.4.3
[08/22/2023-18:13:35] [I] [TRT] [MemUsageChange] Init CUDA: CPU +483, GPU +0, now: CPU 16772, GPU 1205 (MiB)
[08/22/2023-18:13:36] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +368, GPU +104, now: CPU 17322, GPU 1309 (MiB)
[08/22/2023-18:13:36] [I] Start parsing network model
[08/22/2023-18:13:36] [I] [TRT] ----------------------------------------------------------------
[08/22/2023-18:13:36] [I] [TRT] Input filename: model_dynamic_batch.onnx
[08/22/2023-18:13:36] [I] [TRT] ONNX IR version: 0.0.7
[08/22/2023-18:13:36] [I] [TRT] Opset version: 13
[08/22/2023-18:13:36] [I] [TRT] Producer name: pytorch
[08/22/2023-18:13:36] [I] [TRT] Producer version: 1.12.0
[08/22/2023-18:13:36] [I] [TRT] Domain:
[08/22/2023-18:13:36] [I] [TRT] Model version: 0
[08/22/2023-18:13:36] [I] [TRT] Doc string:
[08/22/2023-18:13:36] [I] [TRT] ----------------------------------------------------------------
[08/22/2023-18:13:36] [W] [TRT] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/22/2023-18:13:36] [W] [TRT] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[08/22/2023-18:13:36] [E] [TRT] ModelImporter.cpp:773: While parsing node number 185 [Slice -> "onnx::Add_407"]:
[08/22/2023-18:13:36] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[08/22/2023-18:13:36] [E] [TRT] ModelImporter.cpp:775: input: "onnx::Slice_379"
input: "onnx::Slice_401"
input: "onnx::Slice_1636"
input: "onnx::Slice_1637"
input: "onnx::Slice_406"
output: "onnx::Add_407"
name: "Slice_185"
op_type: "Slice"
[08/22/2023-18:13:36] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[08/22/2023-18:13:36] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:4067 In function importSlice:
[8] Assertion failed: (axes.allValuesKnown()) && "This version of TensorRT does not support dynamic axes."
[08/22/2023-18:13:36] [E] Failed to parse onnx file
[08/22/2023-18:13:36] [I] Finish parsing network model
[08/22/2023-18:13:36] [E] Parsing model failed
[08/22/2023-18:13:36] [E] Failed to create engine from model or file.
[08/22/2023-18:13:36] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8403] # trtexec --onnx=model_dynamic_batch.onnx --saveEngine=model_dynamic_batch.engine --minShapes=input:1x3x256x256 --optShapes=input:4x3x256x256 --maxShapes=input:8x3x256x256
Describe the solution you'd like
The error message indicates a problem with the Slice op, but I don't know how to fix it.

Additional context
No response
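For reference, a minimal sketch of step 1. This is one common way to make the batch dimension dynamic, not necessarily how the model above was produced; the source file name and the dimension name "batch" are assumptions, while the input tensor name "input" matches the trtexec command:

```python
import onnx

# Load the fixed-batch export (this file name is an assumption).
model = onnx.load("model.onnx")
# Rewrite the first (batch) dimension of the graph input to a symbolic
# name so that trtexec can build an engine over a range of batch sizes.
model.graph.input[0].type.tensor_type.shape.dim[0].dim_param = "batch"
onnx.save(model, "model_dynamic_batch.onnx")
```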
-
I'm not sure if anyone from our team is familiar with TensorRT. We would welcome community support on this.
-
Thanks for the response! I exported the ONNX model, simplified it with onnxsim.simplify, and then converted it to a TensorRT engine; this could serve as a reference for others.
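A minimal sketch of that workflow, assuming the file names from the command above. onnxsim constant-folds the graph, which is what typically turns the Slice op's starts/ends/axes inputs into constants that the TensorRT parser accepts:

```python
import onnx
import onnxsim

model = onnx.load("model_dynamic_batch.onnx")
# Simplify and constant-fold the graph; `check` is True when the simplified
# model produces the same outputs as the original on random test inputs.
model_simp, check = onnxsim.simplify(model)
assert check, "simplified ONNX model failed validation"
onnx.save(model_simp, "model_dynamic_batch_sim.onnx")
```

The same trtexec command as in the original post, pointed at the simplified file, can then be used to build the engine.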