@RahulC7 (Contributor) commented Oct 14, 2025

Summary:

# Context

The `trace` function in `compiler.py` returns the static graph representation of the model. Because certain hardware is highly optimized for certain operations, we don't want to decompose those operations in the graph.

Thus, the function currently passes `ops_to_keep` to `trace_fn`

https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler.py?lines=59-86

which then removes these operations before decomposing the graph:
https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler_funcs.py?lines=33-36
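To make the mechanism concrete, here is a minimal sketch, assuming a decomposition table keyed by op name. The table shape, toy op names, and helper function are illustrative, not the actual ExecuTorch API (real decomposition tables are keyed by torch operator overloads):

```python
# Illustrative sketch only: the helper name and toy ops below are
# hypothetical, not the real ExecuTorch decomposition machinery.

def remove_ops_from_decomp_table(decomp_table, ops_to_keep):
    """Return a copy of the table without the ops we want to preserve,
    so those ops survive decomposition intact."""
    return {op: fn for op, fn in decomp_table.items() if op not in ops_to_keep}

# Toy table mapping op names to decomposition functions.
decomp_table = {
    "aten.conv1d": lambda x: x,
    "aten.linear": lambda x: x,
    "aten.add": lambda x: x,
}

ops_to_keep = {"aten.conv1d", "aten.linear"}
filtered = remove_ops_from_decomp_table(decomp_table, ops_to_keep)
# Only "aten.add" remains eligible for decomposition.
print(sorted(filtered))
```

The key point is that preservation happens by subtraction: anything absent from the decomposition table is left intact in the traced graph.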

# Issue

Right now, `ops_to_keep` is hardcoded, and there's no easy way to customize which ops to keep. See [this comment thread](https://www.internalfb.com/diff/D81703253?dst_version_fbid=1326590512174768&transaction_fbid=728299956948640) for more details.

The tl;dr is that there should be stricter separation of concerns between the compiler and the hardware: the compiler shouldn't need to know which ops to keep.

# Possible Solutions

| Solution | Pros | Cons | Implementation |
| --- | --- | --- | --- |
| Delegate `ops_to_keep` to `QuantizationPattern` [**chosen solution**] | Clear separation of concerns, semantically correct, works well with the current composition structure of the classes, easily extensible | Changes existing behavior: e.g., `CadenceFusedConvReluQuantizer` doesn't have all patterns, but the current behavior is that the compiler keeps all default ops. Direct inheritance doesn't work due to the current behavior. | D84524714 |
| Delegate `ops_to_keep` to `CadenceQuantizer` | Keeps existing behavior, simple for users (just add additional ops to keep) | Doesn't work well with more complex compositions (none exist yet), making inheritance additive is somewhat complicated, unclear if it makes sense semantically, harder to customize | D84461403 [not used] |

# This Diff

For now, we delegate `ops_to_keep` to `QuantizationPattern` (the first solution). We:

1. Create a recursive method to get all the operations to preserve, with the base case being the pattern defined for a `CadenceAtenQuantizer`.
2. Use this method to find the ops to keep in `compiler.py`.
3. Add a unit test.

Now, `ops_to_keep` is defined by the patterns themselves!
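The recursive collection in step 1 can be sketched roughly as follows. The class and method names are illustrative stand-ins for `CadenceAtenQuantizer` and the composed Cadence quantizers, not the actual signatures from the diff:

```python
# Hedged sketch of the chosen design: each leaf quantizer's pattern
# declares the ops it needs preserved, and a composed quantizer
# gathers them recursively. Names are illustrative, not the real API.

class AtenQuantizerSketch:
    """Base case: a leaf quantizer whose pattern names its ops."""
    def __init__(self, partition_ops):
        self.partition_ops = list(partition_ops)

    def ops_to_keep(self):
        return set(self.partition_ops)

class ComposedQuantizerSketch:
    """Recursive case: the union of the child quantizers' ops."""
    def __init__(self, quantizers):
        self.quantizers = list(quantizers)

    def ops_to_keep(self):
        ops = set()
        for q in self.quantizers:
            ops |= q.ops_to_keep()  # recurses through nested compositions
        return ops

conv = AtenQuantizerSketch(["aten.conv1d", "aten.conv2d"])
relu = AtenQuantizerSketch(["aten.relu"])
composed = ComposedQuantizerSketch([conv, relu])
# The compiler can now derive the ops to keep from the patterns themselves.
print(sorted(composed.ops_to_keep()))
```

Because the set is derived from whatever patterns a quantizer actually composes, swapping in a different quantizer automatically changes which ops are preserved, with no compiler changes needed.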

Differential Revision: D84524714


pytorch-bot bot commented Oct 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15121

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unrelated Failure

As of commit 85c5ac7 with merge base 9413da0:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 14, 2025

meta-codesync bot commented Oct 14, 2025

@RahulC7 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84524714.

@github-actions

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e., would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track of and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

RahulC7 added a commit to RahulC7/executorch that referenced this pull request Oct 15, 2025
Reviewed By: DrJessop, hsharma35

Differential Revision: D84524714
@meta-codesync meta-codesync bot merged commit 6a0833f into pytorch:main Oct 16, 2025
136 of 140 checks passed

Labels

CLA Signed, fb-exported, meta-exported
