I ran into this error when exporting a GPU fp16 ONNX model:
model_fp16 = float16.convert_float_to_float16(model)
File "/data2/projects/environment/miniforge3/envs/weasr/lib/python3.10/site-packages/onnxconverter_common/float16.py", line 264, in convert_float_to_float16
remove_unnecessary_cast_node(model.graph)
File "/data2/projects/environment/miniforge3/envs/weasr/lib/python3.10/site-packages/onnxconverter_common/float16.py", line 776, in remove_unnecessary_cast_node
raise ValueError(
ValueError: The downstream node of the second cast node should be graph output
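For context, the default conversion path that hits this error looks roughly like the sketch below; the load/save steps and the model path are my assumption, only the convert_float_to_float16(model) call itself appears in the traceback.

import onnx
from onnxconverter_common import float16

# Default conversion with no extra arguments; this is the call that ends in
# remove_unnecessary_cast_node() and raises the ValueError shown above.
model = onnx.load("encoder_fp32.onnx")        # placeholder path
model_fp16 = float16.convert_float_to_float16(model)
onnx.save(model_fp16, "encoder_fp16.onnx")    # placeholder path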
I tried changing the export parameters:
from onnxconverter_common import float16

model_fp16 = float16.convert_float_to_float16(
    model,
    keep_io_types=False,       # ← key: do not keep the original IO types
    min_positive_val=1e-7,
    max_finite_val=1e4,
    disable_shape_infer=True,
)
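Before deploying to Triton it may help to load the converted model locally with onnxruntime; session creation runs the same type checks, so a mixed fp16/fp32 graph should fail here as well. A rough sketch, assuming onnxruntime is installed and using placeholder paths:

import onnx
import onnxruntime as ort

onnx.save(model_fp16, "encoder_fp16.onnx")     # placeholder output path
onnx.checker.check_model("encoder_fp16.onnx")  # structural validation
# Creating the session triggers ONNX Runtime's type/shape checks, so the
# Concat type mismatch reported below should reproduce outside Triton.
sess = ort.InferenceSession(
    "encoder_fp16.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print([(i.name, i.type) for i in sess.get_inputs()])  # confirm IO dtypes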
After this change the GPU fp16 ONNX model exports successfully, but the tritonserver backend then fails to load it:
E1127 10:24:46.433533 151 model_repository_manager.cc:487] Invalid argument: ensemble 'streaming_wenet' depends on 'encoder'which has no loaded version. Model 'encoder' loading failed with error: version 1 is at UNAVAILABLE state: Internal: onnx runtime error 1: Load model from /ws/model_repo/encoder/1/encoder.onnx failed:Type Error: Type parameter (T) of Optype (Concat)bound to different types (tensor(float16) and tensor(float) in node (/Concat_1).;
I1127 10:24:46.433665 151 server.cc:563]
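The Concat error means at least one input of /Concat_1 is still float32 after conversion. A hedged sketch for finding tensors that stayed float32 in the exported graph (paths are placeholders; the node name comes from the log, and intermediate tensors only appear in value_info if shape inference has been run):

import onnx
from onnx import TensorProto

m = onnx.load("encoder_fp16.onnx")             # placeholder path
# Collect every tensor name still declared as float32.
fp32_names = {init.name for init in m.graph.initializer
              if init.data_type == TensorProto.FLOAT}
fp32_names |= {vi.name
               for vi in list(m.graph.value_info) + list(m.graph.input) + list(m.graph.output)
               if vi.type.tensor_type.elem_type == TensorProto.FLOAT}

for node in m.graph.node:
    if node.name == "/Concat_1":               # node name from the Triton error
        for inp in node.input:
            dtype = "float32" if inp in fp32_names else "float16/other"
            print(f"{node.name} input {inp}: {dtype}")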