Description
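Does anyone have a working ONNX export script? The ONNX model produced by my own script fails when converted to OM: ATC stops at the NonMaxSuppressionV6Fusion pass (non-maximum suppression V6 fusion) because the shape of score contains a negative value. Could anyone suggest how to fix this, or share ONNX export code that works?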
atc --model=faster_rcnn_600x600.onnx --framework=5 --output=model_rcnn --input_format=NCHW --soc_version=Ascend310B4 --input_shape="input:1,3,600,600"
ATC start working now, please wait for a moment.
...
ATC run failed, Please check the detail log, Try 'atc --help' for more information
E20007: Failed to run graph fusion pass [NonMaxSuppressionV6Fusion]. The pass type is [built-in-ai-core-graph-pass]
Solution: 1. If the pass code is custom, check the error log and the verification logic. 2. If the pass code is not custom, perform a complete or partial dump by using npucollect.sh and then send the dump to Huawei technical support for fault locating.
TraceBack (most recent call last):
The shape of score cannot be negative.[FUNC:IdxValueConstNode][FILE:non_max_suppression_fusion_pass.cc][LINE:121]
generate const value of idx fail[FUNC:Fusion][FILE:non_max_suppression_fusion_pass.cc][LINE:211]
Failed to run graph fusion pass [NonMaxSuppressionV6Fusion]. The pass type is [built-in-ai-core-graph-pass]
[GraphOpt][FirstRoundFusion] Run graph fusion pass failed, pass name:NonMaxSuppressionV6Fusion, pass type:built-in-ai-core-graph-pass, return value is 4294967295.[FUNC:RunOnePassFusion][FILE:graph_fusion.cc][LINE:1170]
[GraphOpt][FirstRoundFusion] MainGraph[model_rcnn]: RunGraphFusion not success.[FUNC:Fusion][FILE:graph_fusion.cc][LINE:99]
[GraphOpt][AfterFusion]Failed to do graph fusion for graph model_rcnn. ErrNo is 4294967295.[FUNC:OptimizeOriginalGraph][FILE:fe_graph_optimizer.cc][LINE:340]
Call OptimizeOriginalGraph failed, ret:-1, engine_name:AIcoreEngine, graph_name:model_rcnn[FUNC:OptimizeOriginalGraph][FILE:graph_optimize.cc][LINE:178]
build graph failed, graph id:0, ret:-1[FUNC:BuildModelWithGraphId][FILE:ge_generator.cc][LINE:1615]
GenerateOfflineModel execute failed.
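For anyone debugging the same failure, here is a minimal sketch for checking which shape each NonMaxSuppression node in the exported ONNX file sees on its scores input. It rests on an assumption the log does not confirm: that the "negative" shape reported by the fusion pass corresponds to a dimension that is unresolved (dynamic or unknown) in the ONNX graph. The helper names `unresolved_dims` and `nms_score_shapes` are hypothetical; the sketch relies on the `onnx` package's `shape_inference.infer_shapes` and on the ONNX operator definition, in which scores is the second input of NonMaxSuppression.

```python
from typing import Dict, List, Optional, Sequence

# A shape is a list of dims (None = not a fixed value), or None if no shape is known.
Shape = Optional[List[Optional[int]]]


def unresolved_dims(shape: Sequence[Optional[int]]) -> List[int]:
    """Return the indices of dims that are not fixed positive integers."""
    return [i for i, d in enumerate(shape) if not isinstance(d, int) or d <= 0]


def nms_score_shapes(onnx_path: str) -> Dict[str, Shape]:
    """Map each NonMaxSuppression node to the inferred shape of its scores input."""
    # Imported here so unresolved_dims() stays usable without onnx installed.
    import onnx
    from onnx import shape_inference

    model = shape_inference.infer_shapes(onnx.load(onnx_path))
    graph = model.graph

    # Collect every tensor shape that shape inference could determine.
    shapes: Dict[str, Shape] = {}
    for info in list(graph.input) + list(graph.value_info) + list(graph.output):
        tensor_type = info.type.tensor_type
        if not tensor_type.HasField("shape"):
            shapes[info.name] = None
            continue
        shapes[info.name] = [
            d.dim_value if d.HasField("dim_value") else None
            for d in tensor_type.shape.dim
        ]

    # Inputs of NonMaxSuppression are (boxes, scores, ...), so scores is input[1].
    result: Dict[str, Shape] = {}
    for node in graph.node:
        if node.op_type == "NonMaxSuppression":
            result[node.name or node.output[0]] = shapes.get(node.input[1])
    return result
```

Calling `nms_score_shapes("faster_rcnn_600x600.onnx")` and passing each returned shape that is not `None` to `unresolved_dims` would show whether the scores tensor reaches the NMS node with a dimension that is not a fixed positive integer. A `None` shape means inference could not determine the shape at all.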
Here is my ONNX export code:
import torch
import numpy as np
from nets.frcnn import FasterRCNN  # custom FasterRCNN class used for training
from utils.utils import get_classes, get_new_img_size, preprocess_input  # helpers reused from the training code
def export_frcnn_to_onnx():
    # ---------------- 1. Required parameters (same as the training code) ----------------
    # Core parameters copied from the training code so they match exactly
    model_path = "logs/ep040-loss1.072-val_loss1.237.pth"  # trained weights
    classes_path = "model_data/voc_classes.txt"  # class file (same as training)
    backbone = "resnet50"  # backbone (resnet50 during training)
    anchors_size = [4, 16, 32]  # anchor sizes (anchors_size in the training code)
    input_shape = [600, 600]  # input size (input_shape in the training code)
    confidence = 0.5  # confidence threshold (does not affect export; kept to mirror the forward logic)
    nms_iou = 0.30  # NMS threshold (does not affect export; kept to mirror the forward logic)
    cuda = False  # CPU is recommended for export to avoid GPU/ONNX compatibility issues
    onnx_save_path = "faster_rcnn_600x600.onnx"  # output ONNX file name
    opset_version = 11  # ONNX opset version (11+ works with most deployment frameworks)

    # ---------------- 2. Build the model (same structure as in training) ----------------
    # 1. Get the number of classes (same as training: object classes + 1 background class)
    class_names, num_classes = get_classes(classes_path)
    print(f"Number of classes: {num_classes} ({class_names})")

    # 2. Build the custom FasterRCNN model (mode "predict", same as the prediction code)
    model = FasterRCNN(
        num_classes=num_classes,
        mode="predict",  # must be "predict" to match the forward logic
        anchor_scales=anchors_size,  # anchor sizes (anchors_size from training)
        backbone=backbone,  # backbone (resnet50)
    )

    # 3. Load the trained weights (same loading logic as the prediction code)
    device = torch.device("cuda" if cuda and torch.cuda.is_available() else "cpu")
    model.load_state_dict(torch.load(model_path, map_location=device))
    print(f"Loaded weights: {model_path}")

    # 4. Switch to eval mode (disables training-only behavior such as Dropout)
    model.eval()
    if cuda:
        model = torch.nn.DataParallel(model)  # keep consistent if training used multiple GPUs

    # ---------------- 3. Prepare the input (must match what the model expects) ----------------
    # 1. Simulate a 600x600 RGB image (same as the training input size)
    dummy_image = np.ones((input_shape[0], input_shape[1], 3), dtype=np.float32)  # (H, W, C)

    # 2. Preprocess exactly as in training/prediction:
    #    - normalization (preprocess_input)
    #    - layout change (HWC -> CHW)
    #    - add the batch dimension (CHW -> BCHW)
    dummy_image = preprocess_input(dummy_image)  # normalization reused from the training code
    dummy_input = torch.from_numpy(np.transpose(dummy_image, (2, 0, 1))).unsqueeze(0)  # (1, 3, 600, 600)
    dummy_input = dummy_input.to(device)  # same device as the model

    # ---------------- 4. Export the ONNX model ----------------
    # Intended dynamic axes: batch_size dynamic (dim 0), height/width fixed at 600.
    # Note: this dict is defined but not passed to torch.onnx.export below.
    dynamic_axes = {
        "input": {0: "batch_size"},  # dynamic batch dimension of the input
        "roi_cls_locs": {0: "batch_size"},  # output 1: proposal box regression offsets
        "roi_scores": {0: "batch_size"},  # output 2: proposal class scores
        "rois": {0: "batch_size"},  # output 3: proposal box coordinates
    }

    # Export to ONNX
    torch.onnx.export(
        model=model,
        args=dummy_input,  # example input
        f=onnx_save_path,  # output path
        verbose=False,  # no detailed log (set True when debugging)
        opset_version=opset_version,  # opset version
        training=torch.onnx.TrainingMode.EVAL,  # export in eval mode
        do_constant_folding=True,  # enable constant folding
        input_names=["input"],  # input node name (easier to identify at deployment)
        output_names=["roi_cls_locs", "roi_scores", "rois"],  # output node names (match the forward outputs)
        dynamic_axes=None,  # dynamic axes are not applied, so all exported shapes are static
    )
    print(f"ONNX model saved to: {onnx_save_path}")

if __name__ == "__main__":
    export_frcnn_to_onnx()
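One direction that could be explored, if the NMS step can be separated from the exported graph (the script above does not confirm that it can), is to run NMS on the host after inference instead of inside the ONNX model. Below is a minimal pure-Python sketch of greedy NMS under that assumption. The names `iou` and `greedy_nms` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)`, and the default threshold of 0.3 simply mirrors `nms_iou` in the script. A vectorized NumPy version would be preferable for large numbers of boxes.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.3
) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    # Visit boxes from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

The function returns indices rather than boxes so that the same selection can be applied to the matching scores and class labels.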