Conversation

Contributor
@leafs1 leafs1 commented Jun 3, 2025

Summary

Optimize transposes in the XNNPACK partition by adding a new remove_redundant_ops_pass that checks for dim-order conversion ops that cancel each other out. The pass supports both non-quantized and quantized graphs; in the quantized case, the conversion nodes and their wrapping q/dq nodes are removed together. I also refactored channels_last_tagged_reshape_pass by modularizing some functions and adding setter/getter helpers.

This change improves runtime speed and memory usage by not executing redundant to_copy ops that would otherwise remain in the graph.
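The core idea can be sketched as follows. This is an illustrative stand-in, not the actual ExecuTorch pass: the `Node` class, target names, and graph representation are simplified substitutes for `torch.fx` constructs, used only to show how a pair of opposite dim-order conversions cancels.

```python
# Simplified sketch (hypothetical names, not the real pass): remove a pair of
# dim-order conversion ops that convert to a memory format and immediately back.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    target: str                       # e.g. "to_copy_nhwc" / "to_copy_nchw"
    arg: Optional["Node"] = None      # single producer, for simplicity
    users: List["Node"] = field(default_factory=list)

OPPOSITE = {"to_copy_nchw": "to_copy_nhwc", "to_copy_nhwc": "to_copy_nchw"}

def remove_redundant_copies(nodes: List[Node]) -> List[Node]:
    """Drop to_copy pairs whose conversions cancel each other."""
    removed = set()
    for node in nodes:
        prod = node.arg
        if (
            node.target in OPPOSITE
            and prod is not None
            and prod.target == OPPOSITE[node.target]
            # the second copy must be the sole user of the first; otherwise
            # removing the pair would invalidate other consumers
            and len(prod.users) == 1
        ):
            removed.update({id(node), id(prod)})
            # rewire consumers of the second copy to the first copy's input
            for user in node.users:
                user.arg = prod.arg
    return [n for n in nodes if id(n) not in removed]

# demo: x -> to_nhwc -> to_nchw -> conv collapses to x -> conv
x = Node("placeholder")
a = Node("to_copy_nhwc", x); x.users.append(a)
b = Node("to_copy_nchw", a); a.users.append(b)
conv = Node("conv", b); b.users.append(conv)
graph = remove_redundant_copies([x, a, b, conv])
```

The real pass operates on `torch.fx` nodes with `exir_ops.edge.aten._to_copy.default` targets, but the cancellation logic follows the same shape.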

Test plan

Created a TestChannelsLastTaggedReshapePass class that constructs graphs containing multiple redundant to_copy ops in different positions, in both quantized and non-quantized graphs. These redundant ops are either stated explicitly or generated by other passes. The tests assert that they are removed after the passes finish.

@leafs1 leafs1 requested review from digantdesai and mcr229 as code owners June 3, 2025 17:28
pytorch-bot bot commented Jun 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11316

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit 4df2fd6 with merge base 9591978:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 3, 2025
@leafs1 leafs1 force-pushed the milestone2.2 branch 5 times, most recently from 98f4027 to 173e41f Compare June 3, 2025 21:47
Contributor Author

leafs1 commented Jun 4, 2025

@pytorchbot label "release notes: none"

@pytorch-bot pytorch-bot bot added the release notes: none Do not include this in the release notes label Jun 4, 2025
@leafs1 leafs1 changed the title Milestone2.2 Milestone2.2: Optimize transposes in XNNPACK partition by removing redundant to_copy ops Jun 5, 2025
def input_dim_order(
self, input_node: torch.fx.Node, input_order: InputDimOrder
) -> bool:
if input_node.name == "x":
Member

Can you replace this with checking if the input_node is a placeholder?
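The reviewer's suggestion can be sketched like this. In `torch.fx`, graph inputs are nodes with `op == "placeholder"`, so the check is independent of the input's name. The `SimpleNamespace` stand-ins below are for illustration only; the real code would receive `torch.fx.Node` objects.

```python
from types import SimpleNamespace

def is_graph_input(node) -> bool:
    # torch.fx marks graph inputs with op == "placeholder",
    # which is more robust than matching a hard-coded name like "x"
    return getattr(node, "op", None) == "placeholder"

# stand-in nodes for illustration (real code would use torch.fx.Node)
inp = SimpleNamespace(op="placeholder", name="x")
conv = SimpleNamespace(op="call_function", name="conv")
```

This way the pass works regardless of what the user named the model's input.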

from executorch.exir.passes.memory_format_ops_pass import DimOrderOpsRevertPass


class TestChannelsLastTaggedReshapePass(unittest.TestCase):
Member

Can we add a test that includes implicitly created dim-order conversions? This will check that both user-created and pass-created converts get optimized out correctly. I expect it will work, but it would be nice to cover since this is a common use case.

Maybe something like:
to_channels_last
upsample_nearest2d (not partitioned)
to_channels_first
conv
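A module following the reviewer's sketch might look like the following. The module name and layer parameters are hypothetical; this eager-mode version only demonstrates the op pattern (channels-last conversion, an upsample that would not be partitioned, a channels-first conversion, then a conv), while the actual test would lower it through the ExecuTorch Tester pipeline.

```python
import torch
import torch.nn.functional as F

class ImplicitConversionModule(torch.nn.Module):
    """Hypothetical test module: to_channels_last -> upsample (not
    partitioned) -> to_channels_first -> conv."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        x = x.to(memory_format=torch.channels_last)      # to_channels_last
        x = F.interpolate(x, scale_factor=2, mode="nearest")
        x = x.to(memory_format=torch.contiguous_format)  # to_channels_first
        return self.conv(x)
```

Lowering this module should surface both the explicit conversions above and any conversions the channels-last tagging pass inserts around the conv.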


# If we encounter a to_copy node, check if it is preceded by an opposite to_copy node
if node.target == exir_ops.edge.aten._to_copy.default:
if prev and ChannelsLastTaggedReshapePass.is_nchw_node(
Member

I think there may be cases where the previous node in the iteration order is not actually the first arg, especially in more complex graphs. Can you try replacing prev with node.args[0]? That should be sound in all cases.
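The distinction the reviewer is drawing can be shown with a tiny branching graph. In a topological iteration, the node visited just before the current one need not be its data input; the node's first argument always is. Simplified stand-in nodes below; real code would use `torch.fx.Node` and `node.args[0]`.

```python
# In a branching graph, the iteration predecessor can be an unrelated node,
# while args[0] always points at the actual producer.
class Node:
    def __init__(self, name, args=()):
        self.name, self.args = name, tuple(args)

x = Node("x")
a = Node("to_copy_nhwc", [x])
b = Node("other_branch", [x])        # unrelated node between the two copies
c = Node("to_copy_nchw", [a])

order = [x, a, b, c]                 # a valid topological order
prev_of_c = order[order.index(c) - 1]
```

Here `prev_of_c` is the unrelated `other_branch` node, while `c.args[0]` is the matching `to_copy` producer, which is why the args-based lookup is sound where the iteration-order lookup is not.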

from executorch.exir.pass_base import PassResult


class RemoveRedundantOpsPass(XNNPACKPass):
Contributor

Nit: rename this to RemoveRedundantCopyPass or similar? The current name is too generic to infer what it's doing.

@leafs1 leafs1 force-pushed the milestone2.2 branch 3 times, most recently from 2b4643d to 85e1c4d Compare June 27, 2025 18:32
continue

# If we encounter a to_copy node, check if its input is also a to_copy node with opposite format
if node.target == exir_ops.edge.aten._to_copy.default:
Member

I thought of one more edge case while reading this code. We should probably check that the second copy is the only user of the first. It's possible to have two copies in a row while something else also uses the output of the first. It's unlikely, but it would lead to an invalid graph.
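The edge case can be illustrated concretely. The node names below are hypothetical stand-ins: a third node also consumes the first copy's output, so removing both copies would leave that consumer without its channels-last input, and the pass must guard on the user count.

```python
# Edge case: two opposite copies in a row, but a third node also consumes the
# first copy's output. Removing the pair would break that consumer.
class Node:
    def __init__(self, name, args=()):
        self.name, self.args = name, list(args)
        self.users = []

x = Node("x")
first = Node("to_copy_nhwc", [x]);    x.users.append(first)
second = Node("to_copy_nchw", [first]); first.users.append(second)
other = Node("nhwc_consumer", [first]); first.users.append(other)

# the guard: only fuse when the second copy is the sole user of the first
can_remove_pair = len(first.users) == 1
```

With the extra consumer present, `can_remove_pair` is `False` and the pass leaves both copies in place.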

Contributor Author

Added a check, thanks for this find

Contributor

@mcr229 mcr229 left a comment

Just a few extra test cases to make sure things look ok

module.eval(),
inputs,
)
tester.export().to_edge_transform_and_lower().to_executorch().serialize().run_method_and_compare_outputs()
Contributor

For complicated paths, can we also try quantized models?

@leafs1 leafs1 force-pushed the milestone2.2 branch 5 times, most recently from 2f21614 to b4d34d8 Compare July 3, 2025 23:02
@leafs1 leafs1 merged commit 4e29bc9 into pytorch:main Jul 12, 2025
94 of 98 checks passed