Choosing ops_to_preserve by delegating to pattern
#15121
Reviewed By: DrJessop, hsharma35
Summary:
# Context

The `trace` function in `compiler.py` returns the static graph representation of the model. Because certain hardware is heavily optimized for certain operations, we don't want to decompose those operations in our graph.

Thus, currently, the function passes `ops_to_keep` in to `trace_fn`:

https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler.py?lines=59-86

which then removes these operations before decomposing the graph:

https://www.internalfb.com/code/fbsource/[ada71b37b46a898065851182cd48b48b321a5d12]/fbcode/executorch/backends/cadence/aot/compiler_funcs.py?lines=33-36
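To make the mechanism concrete, here is a minimal, torch-free sketch of the preservation idea: if the tracer decomposes operations by looking them up in a decomposition table, then dropping an op's entry from that table before decomposing leaves that op intact in the graph. All names here (`decompose`, `decomp_table`, the op strings and their expansions) are illustrative stand-ins, not the actual `trace_fn` or cadence APIs.

```python
def decompose(graph, decomp_table):
    """Replace each op with its registered decomposition, if any."""
    out = []
    for op in graph:
        # Ops with no table entry pass through unchanged, i.e. are preserved.
        out.extend(decomp_table.get(op, [op]))
    return out


# Hypothetical decomposition table (expansions are illustrative only).
decomp_table = {
    "aten.linear": ["aten.permute", "aten.addmm"],
    "aten.gelu": ["aten.mul", "aten.erf", "aten.add"],
}

ops_to_keep = ["aten.linear"]  # e.g. the hardware has a fast fused kernel

for op in ops_to_keep:
    decomp_table.pop(op, None)  # no decomposition registered => op is kept

graph = ["aten.linear", "aten.gelu"]
print(decompose(graph, decomp_table))
# -> ['aten.linear', 'aten.mul', 'aten.erf', 'aten.add']
```

The point of the PR is not this mechanism itself but who gets to decide the contents of `ops_to_keep`.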
# Issue

Right now, `ops_to_keep` is hardcoded, and there's no easy way to customize which ops to keep. See [this comment thread](https://www.internalfb.com/diff/D81703253?dst_version_fbid=1326590512174768&transaction_fbid=728299956948640) for more details.

The tl;dr is that there should be a stricter separation of concerns between the compiler and the hardware: the compiler shouldn't need to know which ops to keep.
# Possible Solutions

| Solution | Pros | Cons | Implementation |
| -- | -- | -- | -- |
| Delegate `ops_to_keep` to `QuantizationPattern` [**chosen solution**] | Clear separation of concerns; semantically correct; works well with the current composition structure of the classes; easily extensible | Changes existing behavior, e.g. `CadenceFusedConvReluQuantizer` doesn't have all patterns, but the compiler currently keeps all default ops; direct inheritance doesn't work due to the current behavior | D84524714 |
| Delegate `ops_to_keep` to `CadenceQuantizer` | Keeps existing behavior; simple for users (just add additional ops to keep) | Doesn't work well with more complex compositions (though none exist today); making inheritance additive is somewhat complicated; unclear whether it makes sense semantically; harder to customize | D84461403 [not used] |
# This Diff

For now, we delegate `ops_to_keep` to `QuantizationPattern` (the first solution). We:

1. Create a recursive method to collect all the operations to preserve, with the base case being the pattern defined for a `CadenceAtenQuantizer`.
2. Use this method to find the ops to keep in `compiler.py`.
3. Create a unit test.

Now, `ops_to_keep` is defined using the patterns themselves!
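The recursive collection in step 1 can be sketched as follows. This is a self-contained approximation, not the diff's actual code: the class names (`AtenQuantizer`, `ComposedQuantizer`) and the `ops_to_preserve` method are hypothetical stand-ins for the cadence quantizer classes, assuming leaf quantizers wrap a single `QuantizationPattern` and composite quantizers hold a list of sub-quantizers.

```python
class QuantizationPattern:
    """A pattern knows which op partition types it matches (and hence
    which ops the hardware wants kept un-decomposed)."""

    def __init__(self, partition_types):
        self._types = partition_types

    def partition_types(self):
        return self._types


class AtenQuantizer:
    """Leaf quantizer: wraps a single pattern. Base case of the recursion."""

    def __init__(self, pattern):
        self.pattern = pattern

    def ops_to_preserve(self):
        return set(self.pattern.partition_types())


class ComposedQuantizer:
    """Composite quantizer: the union of its sub-quantizers' ops."""

    def __init__(self, quantizers):
        self.quantizers = quantizers

    def ops_to_preserve(self):
        ops = set()
        for q in self.quantizers:  # recurse; leaves supply the base case
            ops |= q.ops_to_preserve()
        return ops


conv = AtenQuantizer(QuantizationPattern(["aten.conv2d"]))
relu = AtenQuantizer(QuantizationPattern(["aten.relu"]))
quantizer = ComposedQuantizer([ComposedQuantizer([conv]), relu])
print(sorted(quantizer.ops_to_preserve()))
# -> ['aten.conv2d', 'aten.relu']
```

With this shape, `compiler.py` would only call `quantizer.ops_to_preserve()` and pass the result to the tracer, so swapping in a quantizer with different patterns automatically changes which ops are kept.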
Differential Revision: D84524714