Skip to content

Conversation

@AdrianLundell
Copy link
Collaborator

@AdrianLundell AdrianLundell commented Oct 2, 2025

Minor included fixes:

  • Make quantized_linear_fusion_pass an XNNPACK pass to initialize it with an exported program
  • Add TO_EXECUTORCH as a valid stage after RUN_PASSES
  • Add ramp_tensor function to simplify creating dummy data

cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai @psiddh

Minor included fixes:
- Make quantized_linear_fusion_pass an XNNPACK pass to
  initialize it with an exported program
- Add TO_EXECUTORCH as a valid stage after RUN_PASSES
- Add ramp_tensor function to simplify creating dummy data

Signed-off-by: Adrian Lundell <[email protected]>
Change-Id: Id13be6427390483aa1df1b76fc363ae4d0eae876
@AdrianLundell AdrianLundell requested a review from cccclai as a code owner October 2, 2025 09:50
@AdrianLundell AdrianLundell added partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm ciflow/trunk release notes: none Do not include this in the release notes labels Oct 2, 2025
@pytorch-bot
Copy link

pytorch-bot bot commented Oct 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14746

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures

As of commit 048b0c0 with merge base 75f968d (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 2, 2025
@AdrianLundell
Copy link
Collaborator Author

@digantdesai @psiddh What do you think?

Copy link
Contributor

@psiddh psiddh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, the PR as it covers all basic scenarios given the current state of Passes
xfails properly document known pass limitations

@psiddh
Copy link
Contributor

psiddh commented Oct 9, 2025

Looks like "int32 NameError" specifically happens in the dialect transformation stage for these two test cases: linear_rank2_pos and linear_bias. This appears more fundamental issue , isn't ?

@AdrianLundell
Copy link
Collaborator Author

I think it has to do with how meta-data is propagated in the pass if I remember correctly, the error is fixed in an upcoming patch at least!

@AdrianLundell AdrianLundell merged commit c979158 into pytorch:main Oct 10, 2025
283 of 287 checks passed
@github-project-automation github-project-automation bot moved this from To triage to Done in ExecuTorch Embedded Oct 10, 2025
def __init__(self):
super().__init__(
XNNPACKPassManager, pass_list=[QuantizedOpFusionPass, ReplaceQuantNodesPass]
XNNPACKPassManager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fork it and put it under backends/cortex_m

I was thinking s/XNNPACK/MCU but we can do the rebranding later and if you want to stick to cortex_m then s/XNNPACK/CortexM ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with CortexM in here 57a7903 so it will have to be rebranded later

@@ -0,0 +1,211 @@
# Copyright 2025 Arm Limited and/or its affiliates.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

self.check_count(ops_before_transforms)
self.run_passes()
self.check_count(ops_after_transforms)
self.run_method_and_compare_outputs(inputs=self.example_inputs, qtol=qtol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we merge test_dialect and test_impl?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the split since the python implementation is so separate from the c++ implementation and it's nice to get feedback on only the thing you are working on. On the other hand you can simply modify the stages of the tester during development to get this so it should be fine!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given, (1) python eager dq->op->q, (2) python impl of the cortex-m op, and (3) C++ impl of the cortex-m op, I guess it makes sense to do A: (1) vs. (2) and B: (1) vs. (3) but when I suggested to merge I assumed we don't really care about (2) besides unit-tests, unless the second comparison (i.e. B) is (2) vs. (3).

@digantdesai digantdesai added the module: microcontrollers For embedded MCUs like Cortex-M, or RTOS like Zephyr, does not track NPU backend like Arm Ethos. label Oct 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: microcontrollers For embedded MCUs like Cortex-M, or RTOS like Zephyr, does not track NPU backend like Arm Ethos. partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm release notes: none Do not include this in the release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants