@RahulC7 (Contributor) commented Oct 14, 2025

Summary:

# Context

The `trace` function in `compiler.py` returns the static graph representation of the model. Because certain hardware is highly optimized for certain operations, we don't want to decompose those operations in the graph.

Thus, the function currently passes `ops_to_keep` to `trace_fn`

https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler.py?lines=59-86

which then removes these operations before decomposing the graph:
https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler_funcs.py?lines=33-36
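To make the mechanism concrete, here is a minimal sketch, assuming a decomposition table keyed by op name. The table shape, toy op names, and helper function are illustrative, not the actual ExecuTorch API (real decomposition tables are keyed by torch operator overloads):

```python
# Illustrative sketch only: the helper name and toy ops below are
# hypothetical, not the real ExecuTorch decomposition machinery.

def remove_ops_from_decomp_table(decomp_table, ops_to_keep):
    """Return a copy of the table without the ops we want to preserve,
    so those ops survive decomposition intact."""
    return {op: fn for op, fn in decomp_table.items() if op not in ops_to_keep}

# Toy table mapping op names to decomposition functions.
decomp_table = {
    "aten.conv1d": lambda x: x,
    "aten.linear": lambda x: x,
    "aten.add": lambda x: x,
}

ops_to_keep = {"aten.conv1d", "aten.linear"}
filtered = remove_ops_from_decomp_table(decomp_table, ops_to_keep)
# Only "aten.add" remains eligible for decomposition.
print(sorted(filtered))
```

The key point is that preservation happens by subtraction: anything absent from the decomposition table is left intact in the traced graph.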

# Issue

Right now, `ops_to_keep` is hardcoded, and there's no easy way to customize which ops to keep. See [this comment thread](https://www.internalfb.com/diff/D81703253?dst_version_fbid=1326590512174768&transaction_fbid=728299956948640) for more details.

The tl;dr is that there should be stricter separation of concerns between the compiler and the hardware: the compiler shouldn't need to know which ops to keep.

# Possible Solutions

| Solution | Pros | Cons | Implementation |
| --- | --- | --- | --- |
| Delegate `ops_to_keep` to `QuantizationPattern` [**chosen solution**] | Clear separation of concerns, semantically correct, works well with the current composition structure of the classes, easily extensible | Changes existing behavior: e.g., `CadenceFusedConvReluQuantizer` doesn't have all patterns, but the current behavior is that the compiler keeps all default ops. Direct inheritance doesn't work due to the current behavior. | D84524714 |
| Delegate `ops_to_keep` to `CadenceQuantizer` | Keeps existing behavior, simple for users (just add additional ops to keep) | Doesn't work well with more complex compositions (none exist yet), making inheritance additive is somewhat complicated, unclear if it makes sense semantically, harder to customize | D84461403 [not used] |

# This Diff

For now, we delegate `ops_to_keep` to `QuantizationPattern` (the first solution). We:

1. Create a recursive method to get all the operations to preserve, with the base case being the pattern defined for a `CadenceAtenQuantizer`.
2. Use this method to find the ops to keep in `compiler.py`.
3. Add a unit test.

Now, `ops_to_keep` is defined by the patterns themselves!
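The recursive collection in step 1 can be sketched roughly as follows. The class and method names are illustrative stand-ins for `CadenceAtenQuantizer` and the composed Cadence quantizers, not the actual signatures from the diff:

```python
# Hedged sketch of the chosen design: each leaf quantizer's pattern
# declares the ops it needs preserved, and a composed quantizer
# gathers them recursively. Names are illustrative, not the real API.

class AtenQuantizerSketch:
    """Base case: a leaf quantizer whose pattern names its ops."""
    def __init__(self, partition_ops):
        self.partition_ops = list(partition_ops)

    def ops_to_keep(self):
        return set(self.partition_ops)

class ComposedQuantizerSketch:
    """Recursive case: the union of the child quantizers' ops."""
    def __init__(self, quantizers):
        self.quantizers = list(quantizers)

    def ops_to_keep(self):
        ops = set()
        for q in self.quantizers:
            ops |= q.ops_to_keep()  # recurses through nested compositions
        return ops

conv = AtenQuantizerSketch(["aten.conv1d", "aten.conv2d"])
relu = AtenQuantizerSketch(["aten.relu"])
composed = ComposedQuantizerSketch([conv, relu])
# The compiler can now derive the ops to keep from the patterns themselves.
print(sorted(composed.ops_to_keep()))
```

Because the set is derived from whatever patterns a quantizer actually composes, swapping in a different quantizer automatically changes which ops are preserved, with no compiler changes needed.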

Differential Revision: D84524714


pytorch-bot bot commented Oct 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15121

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unrelated Failure

As of commit 85c5ac7 with merge base 9413da0:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 14, 2025

meta-codesync bot commented Oct 14, 2025

@RahulC7 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84524714.

@github-actions

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e., would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track of and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

RahulC7 added a commit to RahulC7/executorch that referenced this pull request Oct 15, 2025
Reviewed By: DrJessop, hsharma35

Differential Revision: D84524714
@meta-codesync meta-codesync bot merged commit 6a0833f into pytorch:main Oct 16, 2025
136 of 140 checks passed

Labels

CLA Signed, fb-exported, meta-exported
