After quantizing a model with W8BF16, the exported tpu-mlir is missing the relu operator, because MatMulLowering::LoweringBF16 does not handle the do_relu=true case. LoweringF16 does handle this condition, so as a temporary workaround, switching W8BF16 to W8F16 avoids the problem.
tpu-mlir/lib/Conversion/TopToTpu/BM1684X/MatMul.cpp
Lines 729 to 742 in a20df03
```cpp
if (true == op.getDoRelu()) {
  auto name = module::getName(op->getResult(0));
  auto matmul_loc =
      NameLoc::get(rewriter.getStringAttr(name.str() + "_a16matmul"));
  auto a16matmul_op = rewriter.create<tpu::A16MatMulOp>(
      matmul_loc, newType, operands, attrs);
  std::vector<NamedAttribute> relu_attrs;
  auto relu_limit =
      rewriter.getNamedAttr("relu_limit", op.getReluLimitAttr());
  relu_attrs.push_back(relu_limit);
  rewriter.replaceOpWithNewOp<tpu::ReluOp>(
      op, newType, ValueRange{a16matmul_op.getOutput()}, relu_attrs);
  return;
}
```
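To illustrate what dropping the do_relu handling means numerically, here is a minimal NumPy sketch (illustrative values only, not tpu-mlir code): when do_relu=true is silently ignored during lowering, negative MatMul outputs pass through unclamped instead of being zeroed by the fused relu.

```python
import numpy as np

# Toy MatMul inputs (hypothetical values for illustration).
x = np.array([[1.0, -2.0]], dtype=np.float32)
w = np.array([[1.0], [1.0]], dtype=np.float32)

# Raw MatMul result: what the broken BF16 lowering produces.
y = x @ w                      # [[-1.0]]

# With do_relu=true honored, the output must be clamped at zero.
y_relu = np.maximum(y, 0.0)    # [[0.0]]

print(y.tolist(), y_relu.tolist())  # [[-1.0]] [[0.0]]
```

The fix would mirror the F16 path shown above inside LoweringBF16: emit the tpu::A16MatMulOp and then replace the original op with a tpu::ReluOp consuming its output.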