WIP - removeBinary #414


Open

SushmitaThakallapalli1980 wants to merge 9 commits into Xilinx:feature/onnx-to-tosa from SushmitaThakallapalli1980:feature/onnx-to-tosa

SushmitaThakallapalli1980 commented Aug 20, 2025 •


REMOVE_BINARY summary:

This RemoveBinary class is a pattern-matching and rewrite utility that finds and removes binary ops (Add, Sub, Mul, Div) in quantized ONNX graphs when one input is effectively a constant. It folds the constant into the quantization parameters instead of keeping an explicit binary op in the graph.

This class rewrites patterns like:

Dequant(x) -----\
                 Binary(Add/Sub/Mul/Div) → Quant → Dequant
Dequant(const) -/

into something like

  Dequant'(x, new_scale/zp) → Quant → Dequant

where the constant is folded into the scale or zero-point.

A follow-up check then determines whether the remaining Quant → Dequant chain can also be removed.
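
To make the folding concrete, here is a minimal arithmetic sketch, assuming the usual DequantizeLinear definition real = (q - zero_point) * scale and a constant k on the right-hand side of the binary op. The names QParams, BinaryKind, dequantize and foldConstant are hypothetical and are not taken from this PR. The zero-point is kept real-valued for illustration only; an actual pass would have to check that the folded zero-point is an integer within the range of the quantized type.

```cpp
#include <cassert>

// Hypothetical names for illustration; not the types used in the PR.
enum class BinaryKind { Add, Sub, Mul, Div };

struct QParams {
  double scale;
  double zeroPoint; // real-valued here; a real pass must keep it an in-range integer
};

// DequantizeLinear: real = (q - zeroPoint) * scale.
double dequantize(double q, const QParams &p) {
  return (q - p.zeroPoint) * p.scale;
}

// Returns p' such that dequantize(q, p') == dequantize(q, p) <op> k,
// with the constant k as the right-hand operand (x - k, x / k).
QParams foldConstant(const QParams &p, BinaryKind kind, double k) {
  assert(p.scale != 0.0 && "scale must be non-zero");
  switch (kind) {
  case BinaryKind::Add: // (q - zp) * s + k == (q - (zp - k / s)) * s
    return {p.scale, p.zeroPoint - k / p.scale};
  case BinaryKind::Sub: // (q - zp) * s - k == (q - (zp + k / s)) * s
    return {p.scale, p.zeroPoint + k / p.scale};
  case BinaryKind::Mul: // ((q - zp) * s) * k == (q - zp) * (s * k)
    return {p.scale * k, p.zeroPoint};
  case BinaryKind::Div: // ((q - zp) * s) / k == (q - zp) * (s / k)
    assert(k != 0.0 && "cannot divide by zero");
    return {p.scale / k, p.zeroPoint};
  }
  return p; // unreachable for valid enum values
}
```

In this sketch Add and Sub only move the zero-point, while Mul and Div only rescale, which is why the explicit binary op can disappear from the graph.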

SushmitaThakallapalli1980 requested a review from tvivies-amd

August 20, 2025 07:54

sushmita and others added 2 commits

August 20, 2025 02:54


          WIP - removeBinary


          private struct created

a482565

tvivies-amd requested a review from xiaohanAMD

August 20, 2025 11:52

tvivies-amd commented Aug 20, 2025

I added @xiaohanAMD as reviewer too since he wrote the initial pattern and some improvements

tvivies-amd reviewed


tvivies-amd left a comment

A couple of small comments. I like the use of if constexpr and the state structure.

src/Dialect/ONNX/Transforms/DQBinaryQOpt.cpp Outdated

+                    }
+                    state.kValue = kValueOpt.value();
+                    // Debug - To be removed

tvivies-amd Aug 20, 2025

Ideally you would not remove the debug output that you used during the implementation; I would prefer that you log it instead. @ehsan-toosi do you know if there are any logging capabilities in this repo?

src/Dialect/ONNX/Transforms/DQBinaryQOpt.cpp

Comment on lines +225 to +226

if (failed(match_qdq(state, dqOp1, dqOp2)))
  return failure();

tvivies-amd Aug 20, 2025

nit: add braces here too, to keep the coding style consistent

Author

SushmitaThakallapalli1980 Aug 22, 2025

ok

src/Dialect/ONNX/Transforms/DQBinaryQOpt.cpp Outdated

Comment on lines 228 to 231

+                    // Debug - To be removed
+                    llvm::outs() << "B. SUCCESS\n";
+                    llvm::outs() << "kValue = " << state.kValue << "\n";
+                    printOnnxNodeName(binaryOp, "[RemoveBinary] matched");

tvivies-amd Aug 20, 2025

I think you can put this in the match_qdq method and directly return the result of the call to match_qdq.

src/Dialect/ONNX/Transforms/DQBinaryQOpt.cpp Outdated

Comment on lines 243 to 252

+                  for (Value res : op->getResults()) {
+                    for (Operation *user : res.getUsers()) {
+                      if (auto q = dyn_cast<ONNXQuantizeLinearOp>(user)) {
+                        quantOutputOp = q;
+                        break;
+                      }
+                    }
+                    if (quantOutputOp)
+                      break;
+                  }

tvivies-amd Aug 20, 2025

If the binary op is a qdq operation, I would expect the output of the binOp to have only one user, the QuantizeLinear operation. This is what the Python implementation expects, and it is fine if we have the same check here.

Author

SushmitaThakallapalli1980 Aug 22, 2025

ok.
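
The stricter check requested above can be sketched as follows. This is a toy model, assuming a single-output node with an explicit list of users; Node and getSoleQuantizeUser are hypothetical stand-ins and are not MLIR types or functions from this PR.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a graph node; this is NOT an MLIR type.
struct Node {
  std::string opType;              // e.g. "Add", "QuantizeLinear"
  std::vector<const Node *> users; // consumers of this node's single output
};

// Returns the QuantizeLinear consumer only when it is the sole user of
// `binary`; returns nullptr otherwise so the caller can reject the match.
const Node *getSoleQuantizeUser(const Node &binary) {
  if (binary.users.size() != 1)
    return nullptr; // no user, or a fork: not a plain Q/DQ chain
  const Node *user = binary.users.front();
  return user->opType == "QuantizeLinear" ? user : nullptr;
}
```

Compared with scanning all users for the first QuantizeLinear, this rejects a binary op whose output also feeds other consumers. In the actual pass the same idea would presumably be expressed with the use-list queries on the op result, such as hasOneUse(), followed by a dyn_cast of the single user.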

src/Dialect/ONNX/Transforms/DQBinaryQOpt.cpp

+                      .add<FoldBinaryThroughQDQ<ONNXDivOp>, FoldBinaryThroughQDQ<ONNXSubOp>,
+                          FoldBinaryThroughQDQ<ONNXMulOp>, FoldBinaryThroughQDQ<ONNXAddOp>>(
+                          &getContext());
+                  if (failed(applyPatternsAndFoldGreedily(function, std::move(patterns))))

tvivies-amd Aug 20, 2025

I wonder if we need the greedy approach here, since we are not creating any new ops that would need to be visited by the pass, and the modified ops do not need to be matched again either. See the documentation of the walk driver: https://mlir.llvm.org/docs/PatternRewriter/#walk-pattern-rewrite-driver
What do you think?

Author

SushmitaThakallapalli1980 Aug 22, 2025

I tried replacing applyPatternsAndFoldGreedily with walkAndApplyPatterns, but the code crashes for some test cases and I am not sure why, so I restored the greedy driver.


          match, rewrite and tests added

05c0cd8

xiaohanAMD commented Aug 22, 2025 •


This pattern gets very complicated when there is a fork in the match chain. Please add a test case to cover this scenario. Here Q1 has a fork, so we expect the constant to be folded into Q2 rather than into DQ1:

        const ---\
                 |
                 v
Q1 ---> DQ1 -> BinOp -> Q2 -> DQ2
    |
    \-> something else

Author

SushmitaThakallapalli1980 commented Aug 22, 2025

@xiaohanAMD
According to the initial scoping in https://jira.xilinx.com/browse/AIESW-8092, folding is expected only on DQ1. Is this not the case?

xiaohanAMD commented Aug 22, 2025

I didn't see that. Sure, let's implement the basic case in this PR; we can handle the more complicated cases later.

SushmitaThakallapalli1980 and others added 3 commits

August 22, 2025 10:36


          Merge branch 'feature/onnx-to-tosa' into feature/onnx-to-tosa

c78356d


          fix compile errors

323b0a7


          compiler bug-fix

8940de9

SushmitaThakallapalli1980 requested a review from jorickert

August 22, 2025 16:48


          added test cases. code bug-fixes

580d060

jorickert closed this

jorickert reopened this

sushmita and others added 2 commits

August 25, 2025 09:41


          mlir tests - bug fixes

0d16428


          Merge branch 'feature/onnx-to-tosa' into feature/onnx-to-tosa

965db7d

SushmitaThakallapalli1980 marked this pull request as ready for review

August 25, 2025 17:54
