GPU fp16 ONNX model export issue #2794

Exporting a GPU fp16 ONNX model failed with this error:
```
    model_fp16 = float16.convert_float_to_float16(model)
  File "/data2/projects/environment/miniforge3/envs/weasr/lib/python3.10/site-packages/onnxconverter_common/float16.py", line 264, in convert_float_to_float16
    remove_unnecessary_cast_node(model.graph)
  File "/data2/projects/environment/miniforge3/envs/weasr/lib/python3.10/site-packages/onnxconverter_common/float16.py", line 776, in remove_unnecessary_cast_node
    raise ValueError(
ValueError: The downstream node of the second cast node should be graph output
```
I tried changing the export parameters:
```
from onnxconverter_common import float16

model_fp16 = float16.convert_float_to_float16(
    model,
    keep_io_types=False,       # key change: do not keep the original I/O types
    min_positive_val=1e-7,
    max_finite_val=1e4,
    disable_shape_infer=True,
)
```
With this change the GPU fp16 ONNX model exports successfully, but the tritonserver backend fails to load it:
```
E1127 10:24:46.433533 151 model_repository_manager.cc:487] Invalid argument: ensemble 'streaming_wenet' depends on 'encoder'which has no loaded version. Model 'encoder' loading failed with error: version 1 is at UNAVAILABLE state: Internal: onnx runtime error 1: Load model from /ws/model_repo/encoder/1/encoder.onnx failed:Type Error: Type parameter (T) of Optype (Concat)bound to different types (tensor(float16) and tensor(float) in node (/Concat_1).;
I1127 10:24:46.433665 151 server.cc:563]
```
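The Triton log says that `/Concat_1` receives both `tensor(float16)` and `tensor(float)` inputs. To see which inputs stayed in fp32, one option is to scan the exported graph for `Concat` nodes with mixed input types. The following is only a diagnostic sketch, not a fix: the helper names `collect_tensor_types` and `find_mixed_float_nodes` are made up for illustration, and it only sees tensors whose type is recorded in the graph (initializers, graph inputs and outputs, `value_info`). Since the export used `disable_shape_infer=True`, intermediate tensors may be missing from `value_info`; running `onnx.shape_inference.infer_shapes` on the model first may fill them in.

```python
# ONNX TensorProto element-type codes (fixed by the ONNX spec).
FLOAT, FLOAT16 = 1, 10


def collect_tensor_types(graph):
    """Map tensor name -> element type for every tensor the graph declares."""
    types = {init.name: init.data_type for init in graph.initializer}
    for vi in list(graph.input) + list(graph.value_info) + list(graph.output):
        types[vi.name] = vi.type.tensor_type.elem_type
    return types


def find_mixed_float_nodes(graph, op_type="Concat"):
    """Return (node name, {input name: type}) for nodes mixing fp32 and fp16 inputs."""
    types = collect_tensor_types(graph)
    mixed = []
    for node in graph.node:
        if node.op_type != op_type:
            continue
        # Inputs with no recorded type are skipped, so results may be incomplete.
        seen = {name: types[name] for name in node.input if name in types}
        if {FLOAT, FLOAT16} <= set(seen.values()):
            mixed.append((node.name, seen))
    return mixed


# Usage (requires the onnx package; the path is the one from the Triton log):
#   import onnx
#   model = onnx.load("/ws/model_repo/encoder/1/encoder.onnx")
#   for name, inputs in find_mixed_float_nodes(model.graph):
#       print(name, inputs)
```

Any input reported with type `1` (`FLOAT`) is one the conversion left in fp32, for example an initializer or a tensor produced by a node that was not converted.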
