GPU fp16 ONNX model export issue #2794

@zyb8543d

Exporting a GPU fp16 ONNX model fails with this error:

    model_fp16 = float16.convert_float_to_float16(model)
  File "/data2/projects/environment/miniforge3/envs/weasr/lib/python3.10/site-packages/onnxconverter_common/float16.py", line 264, in convert_float_to_float16
    remove_unnecessary_cast_node(model.graph)
  File "/data2/projects/environment/miniforge3/envs/weasr/lib/python3.10/site-packages/onnxconverter_common/float16.py", line 776, in remove_unnecessary_cast_node
    raise ValueError(
ValueError: The downstream node of the second cast node should be graph output

I tried changing the conversion parameters:

    model_fp16 = float16.convert_float_to_float16(
        model,
        keep_io_types=False,      # key change: do not keep the original IO types
        min_positive_val=1e-7,
        max_finite_val=1e4,
        disable_shape_infer=True,
    )

With these changes the GPU fp16 ONNX model exports successfully, but the tritonserver backend reports an error when loading the model:

E1127 10:24:46.433533 151 model_repository_manager.cc:487] Invalid argument: ensemble 'streaming_wenet' depends on 'encoder'which has no loaded version. Model 'encoder' loading failed with error: version 1 is at UNAVAILABLE state: Internal: onnx runtime error 1: Load model from /ws/model_repo/encoder/1/encoder.onnx failed:Type Error: Type parameter (T) of Optype (Concat)bound to different types (tensor(float16) and tensor(float) in node (/Concat_1).;
I1127 10:24:46.433665 151 server.cc:563]
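
The load error says that `/Concat_1` receives both `tensor(float16)` and `tensor(float)` inputs. To see which inputs stayed in fp32 after conversion, a scan along the following lines might help. This is only a diagnostic sketch, not a fix: `find_mixed_float_nodes` and `collect_elem_types` are hypothetical helper names, and the code assumes the standard ONNX `GraphProto` fields (`node`, `input`, `output`, `value_info`, `initializer`) and the element-type codes `FLOAT = 1` and `FLOAT16 = 10`.

```python
# ONNX TensorProto element-type codes: FLOAT = 1, FLOAT16 = 10.
FLOAT, FLOAT16 = 1, 10


def collect_elem_types(graph):
    """Map tensor name -> element type for every tensor whose type the graph records."""
    elem_types = {}
    # Graph inputs, intermediate tensors and graph outputs carry type annotations.
    for value_info in list(graph.input) + list(graph.value_info) + list(graph.output):
        elem_types[value_info.name] = value_info.type.tensor_type.elem_type
    # Initializers go last so the type of the stored data takes precedence.
    for init in graph.initializer:
        elem_types[init.name] = init.data_type
    return elem_types


def find_mixed_float_nodes(graph, op_types=("Concat",)):
    """Return (node name, {input name: element type}) for each node of the
    given op types that has both fp32 and fp16 inputs."""
    elem_types = collect_elem_types(graph)
    mixed = []
    for node in graph.node:
        if node.op_type not in op_types:
            continue
        # Empty names mark omitted optional inputs; unknown types map to None.
        input_types = {name: elem_types.get(name) for name in node.input if name}
        seen = set(input_types.values())
        if FLOAT in seen and FLOAT16 in seen:
            mixed.append((node.name, input_types))
    return mixed
```

With the real model this would be called as `find_mixed_float_nodes(onnx.load("encoder.onnx").graph)`. Because the conversion above ran with `disable_shape_infer=True`, intermediate tensor types may be missing from `value_info` and then show up as `None`. Running `onnx.shape_inference.infer_shapes` on the model first may fill them in, though that is an assumption that has not been checked against this model.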
