
Conversation

@RahulC7 RahulC7 commented Oct 13, 2025

Summary:

Context

The `trace` function in `compiler.py` returns the static graph representation of the model. Because certain hardware targets are highly optimized for certain operations, we don't want to decompose those operations in our graph.

Thus, currently, the function passes `ops_to_keep` to `trace_fn`

https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler.py?lines=59-86

which then removes these operations before decomposing the graph:
https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler_funcs.py?lines=33-36
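
To make the mechanism concrete, here is a minimal, hedged sketch of how a preserved-ops list typically interacts with decomposition under `torch.export`. The real helper lives in `compiler_funcs.py`; the function name `trace_fn_sketch`, the use of `core_aten_decompositions`, and the signature below are illustrative assumptions, not the actual implementation.

```python
# Hedged sketch (not the fbcode implementation): the trace helper drops the
# preserved ops from the decomposition table before decomposing the graph,
# so those ops survive decomposition untouched.
from typing import Sequence

import torch
from torch._decomp import core_aten_decompositions
from torch.export import ExportedProgram


def trace_fn_sketch(
    ep: ExportedProgram,
    ops_to_keep: Sequence[torch._ops.OpOverload],
) -> ExportedProgram:
    # Start from a standard decomposition table (op overload -> decomposition).
    decomp_table = core_aten_decompositions()
    # Removing an entry means that op is left intact instead of decomposed.
    for op in ops_to_keep:
        decomp_table.pop(op, None)
    return ep.run_decompositions(decomp_table)
```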

Issue

Right now, `ops_to_keep` is hardcoded, and there's no easy way to customize which ops to keep. View [this comment thread](https://www.internalfb.com/diff/D81703253?dst_version_fbid=1326590512174768&transaction_fbid=728299956948640) for more details.

The tl;dr is that there should be a stricter separation of concerns between the compiler and the hardware -- the compiler shouldn't need to know which ops to keep.

Possible Solutions

| Solution | Pros | Cons |
| -- | -- | -- |
| Delegate `ops_to_keep` to `QuantizationPattern` | Clear separation of concerns; semantically correct; works well with the current composition structure of the classes; easily extensible | Changes existing behavior: e.g., `CadenceFusedConvReluQuantizer` doesn't have all patterns, but the current behavior is that the compiler keeps all default ops anyway. Also, `QuantizationPattern` isn't owned by us, so we might have to create a wrapper, which may not be worth the long-term maintenance. |
| Delegate `ops_to_keep` to `CadenceQuantizer` (**chosen solution**) | Keeps existing behavior; simple for users (just add additional ops to keep) | Doesn't work well with more complex compositions (none exist today); making inheritance additive is somewhat complicated; unclear whether it makes sense semantically; harder to customize |

This Diff

For now, we delegate `ops_to_keep` to `CadenceQuantizer` (the second solution). We:

  1. Create a method `get_ops_to_preserve_from_decomposition` in `CadenceQuantizer` and fill it with the default ops that are currently kept for every quantizer; create a method `_collect_additional_ops` that walks the inheritance chain to gather all additional ops to keep; and create the class attribute `ADDITIONAL_OPS_TO_PRESERVE` for subclasses to override.
  2. Port the current logic over into the correct quantizer.
  3. Update the `trace` function in `compiler.py` to use this method rather than the hardcoded list.
  4. Add unit tests to validate the changes.

Now, if someone has some ops they would like to keep for a quantizer, they just need to update that quantizer's `ADDITIONAL_OPS_TO_PRESERVE`.
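
As a rough illustration of the shape this takes (the method and attribute names come from the summary above, but the class internals, the default op list, and the subclass shown are assumptions rather than the actual Cadence code):

```python
# Hedged sketch of the proposed API shape; not the real CadenceQuantizer.
from typing import List

import torch


class CadenceQuantizerSketch:
    # Subclasses override this to preserve extra ops from decomposition.
    ADDITIONAL_OPS_TO_PRESERVE: List[torch._ops.OpOverload] = []

    # Placeholder for the default ops currently hardcoded in compiler.py.
    _DEFAULT_OPS_TO_PRESERVE: List[torch._ops.OpOverload] = []

    @classmethod
    def _collect_additional_ops(cls) -> List[torch._ops.OpOverload]:
        # Walk the MRO and collect each class's own ADDITIONAL_OPS_TO_PRESERVE,
        # so contributions are additive across the inheritance chain instead of
        # being replaced by the most-derived override.
        ops: List[torch._ops.OpOverload] = []
        for klass in cls.__mro__:
            ops.extend(klass.__dict__.get("ADDITIONAL_OPS_TO_PRESERVE", []))
        return ops

    @classmethod
    def get_ops_to_preserve_from_decomposition(cls) -> List[torch._ops.OpOverload]:
        # Default ops plus everything contributed along the inheritance chain.
        return [*cls._DEFAULT_OPS_TO_PRESERVE, *cls._collect_additional_ops()]


class MyHardwareQuantizer(CadenceQuantizerSketch):
    # Hypothetical subclass: keeping aten.matmul intact only requires this line.
    ADDITIONAL_OPS_TO_PRESERVE = [torch.ops.aten.matmul.default]
```

With this shape, `trace` in `compiler.py` asks the quantizer for `get_ops_to_preserve_from_decomposition()` instead of holding the hardcoded list itself.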

Differential Revision: D84461403


pytorch-bot bot commented Oct 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15047

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 7 New Failures, 5 Unrelated Failures

As of commit 3cbf2ad with merge base d00279d:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Oct 13, 2025

meta-codesync bot commented Oct 13, 2025

@RahulC7 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84461403.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.
