How to fuse QuantizeLinear Node in this case? #4517

@lzcchl

Description

I'm new to QAT. When I convert my ONNX model to a TensorRT engine, I see that the "Add" layer runs in FP16 mode. Do you have any suggestions? How should I place the Q/DQ nodes in the right position?

My trtexec command is:

    trtexec \
    --onnx=${onnx_path} \
    --fp16 \
    --int8 \
    --best \
    --verbose \
    --saveEngine=${trt_path} \
    --warmUp=500 \
    --duration=10 \
    --iterations=100 \
    --useCudaGraph \
    --useSpinWait \
    --noDataTransfers \
    --profilingVerbosity=detailed \
    --minShapes=images:1x3x40x40 \
    --optShapes=images:1x3x640x640 \
    --maxShapes=images:1x3x640x640 \
    > verbose.log

This is my ONNX graph:

Image

This is the TensorRT engine graph:

Image
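
Since the graph screenshots may not show every edge, one way to narrow the question down is to list which inputs of each Add node are not fed by a DequantizeLinear node. The sketch below is only an illustration: `Node` is a hypothetical stand-in that carries the same field names as an ONNX node (`op_type`, `input`, `output`, `name`), and the tiny example graph is made up, not taken from the model in this issue.

```python
from collections import namedtuple

# Hypothetical stand-in for an ONNX node: only the fields the check reads.
Node = namedtuple("Node", ["op_type", "input", "output", "name"])


def add_inputs_without_dq(nodes):
    """List (add_name, tensor, producer_op) for Add inputs not produced by DequantizeLinear."""
    # Map every tensor name to the op type of the node that produces it.
    producer = {}
    for node in nodes:
        for out in node.output:
            producer[out] = node.op_type

    findings = []
    for node in nodes:
        if node.op_type != "Add":
            continue
        for tensor in node.input:
            # Tensors with no producing node are graph inputs or initializers.
            op = producer.get(tensor, "graph input or initializer")
            if op != "DequantizeLinear":
                findings.append((node.name, tensor, op))
    return findings


# Made-up residual block: Q/DQ before the Conv, nothing on the skip branch.
nodes = [
    Node("QuantizeLinear", ["x", "s", "z"], ["x_q"], "q0"),
    Node("DequantizeLinear", ["x_q", "s", "z"], ["x_dq"], "dq0"),
    Node("Conv", ["x_dq", "w_dq"], ["conv_out"], "conv0"),
    Node("Add", ["conv_out", "skip"], ["y"], "add0"),
]

for name, tensor, op in add_inputs_without_dq(nodes):
    print(f"{name}: input {tensor} comes from {op}")
```

With the `onnx` package installed, the same function should accept `onnx.load(onnx_path).graph.node` directly, because it only reads those four fields. How to interpret the result is a judgment call: as far as I understand TensorRT's Q/DQ placement recommendations, an Add input that comes straight from a Conv is usually fine (the Add can fuse into the Conv), while a skip branch without Q/DQ is the more likely reason the Add stays in FP16.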

Labels: Module:ONNX (issues relating to ONNX usage and import), Module:Quantization (issues related to quantization)
