LLM export pass to swap in custom SDPA #10355

sxu · 2025-04-22T16:44:49Z

Differential Revision: D73444078

pytorch-bot · 2025-04-22T16:44:53Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10355

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 40181b2 with merge base 22ba09e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-04-22T16:44:58Z

This pull request was exported from Phabricator. Differential Revision: D73444078

Summary: Pull Request resolved: pytorch#10355 Differential Revision: D73444078

jackzhxng · 2025-04-22T20:01:22Z

extension/llm/export/export_passes.py

+    def call_operator(self, op, args, kwargs, meta):
+        from executorch.extension.llm.custom_ops import custom_ops  # noqa
+
+        if op != torch.ops.aten.scaled_dot_product_attention.default:


Isn't this op getting decomposed?

The idea is to run this pass before to_edge, while the op has not been decomposed yet, and to avoid the decomposed version for performance reasons.
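The ordering argument can be illustrated with a toy sketch in plain Python. This is not the ExecuTorch API: ops are plain strings, `decompose` and `swap_in_custom_sdpa` are hypothetical stand-ins for the to_edge decomposition and for this pass, and the list of decomposed ops is illustrative only.

```python
SDPA = "aten.scaled_dot_product_attention.default"
CUSTOM_SDPA = "custom_sdpa"  # placeholder label, not a real op name


def decompose(ops):
    """Stand-in for decomposition: expand SDPA into smaller primitive ops."""
    out = []
    for op in ops:
        if op == SDPA:
            out.extend(["matmul", "softmax", "matmul"])  # illustrative only
        else:
            out.append(op)
    return out


def swap_in_custom_sdpa(ops):
    """Stand-in for the export pass: replace SDPA with the custom op."""
    return [CUSTOM_SDPA if op == SDPA else op for op in ops]


graph = ["linear", SDPA, "linear"]

# Running the pass first keeps the fused custom op in the final graph.
pass_first = decompose(swap_in_custom_sdpa(graph))

# Running it after decomposition finds nothing left to match.
decompose_first = swap_in_custom_sdpa(decompose(graph))
```

In the second ordering the pass is a silent no-op, which is why the pass has to see the graph before decomposition.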

kimishpatel · 2025-04-23T02:48:57Z

extension/llm/export/export_passes.py

+        kT = self._transpose(k, meta)
+        vT = self._transpose(v, meta)
+
+        if mask is not None and mask.node.meta["val"].dtype == torch.bool:


Add a TODO here noting that this conversion won't be needed once custom SDPA supports boolean masks. Tag me on the TODO.
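For context, a minimal sketch of what converting a boolean mask to an additive float mask means, assuming the usual convention that True marks positions that may be attended to. It uses plain Python lists instead of tensors to stay dependency-free; `bool_mask_to_additive` and `softmax` are hypothetical helper names, not functions from the PR.

```python
import math

NEG_INF = float("-inf")


def bool_mask_to_additive(mask):
    """Map True -> 0.0 (attend) and False -> -inf (masked out)."""
    return [[0.0 if keep else NEG_INF for keep in row] for row in mask]


def softmax(row):
    """Numerically stable softmax; assumes at least one finite entry."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


bool_mask = [
    [True, False, False],
    [True, True, False],
    [True, True, True],
]
additive = bool_mask_to_additive(bool_mask)

# Adding the mask to the attention scores (all zero here) before softmax
# gives masked positions exactly zero weight.
scores = [0.0, 0.0, 0.0]
weights = softmax([s + m for s, m in zip(scores, additive[1])])
```

The second query row attends equally to the first two keys and not at all to the third, which matches what the boolean mask expresses.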

kimishpatel · 2025-04-23T02:49:43Z

extension/llm/export/export_passes.py

+                (mask, 0.0, float("-inf")),
+                {},
+                meta,
+            )


Also worth checking whether the mask has more than 2 dimensions. If it does, add the appropriate squeeze ops, while making sure the first N - 2 dims are all 1.
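A minimal sketch of the shape check described above, operating on a shape tuple rather than a real tensor. `leading_dims_to_squeeze` is a hypothetical helper name, and raising an error on unsupported shapes is an assumption; a real pass might instead skip the rewrite.

```python
def leading_dims_to_squeeze(shape):
    """Return the indices of the leading dims to squeeze so the mask is 2D.

    Raises ValueError if any of the first N - 2 dims is not 1.
    """
    rank = len(shape)
    if rank < 2:
        raise ValueError(f"expected a mask with at least 2 dims, got {rank}")
    leading = shape[:-2]
    if any(dim != 1 for dim in leading):
        raise ValueError(f"cannot squeeze mask of shape {tuple(shape)} to 2D")
    return list(range(len(leading)))


dims = leading_dims_to_squeeze((1, 1, 128, 128))  # both leading dims are 1
already_2d = leading_dims_to_squeeze((128, 128))  # nothing to squeeze
```

A mask with a non-singleton batch or head dimension, such as `(2, 1, 128, 128)`, is rejected because squeezing it would drop per-batch information.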

kimishpatel · 2025-04-23T02:50:45Z

extension/llm/export/export_passes.py

+                meta,
+            )
+
+        custom_sdpa = super().call_operator(


I would like to add an option here that lets us assume the mask is causal, so that we can just set mask = None and is_causal = True. Can you do that and add a corresponding test?
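One way the requested option could look, sketched in plain Python. The flag name `assume_causal_mask` and both helper functions are hypothetical; the thread does not say how the option was finally named or implemented.

```python
def resolve_mask_args(mask, assume_causal_mask=False):
    """Pick the (mask, is_causal) pair to hand to the attention op.

    assume_causal_mask is a hypothetical flag name. When set, the explicit
    mask is dropped and the kernel is asked to apply causal masking itself.
    """
    if assume_causal_mask:
        return None, True
    return mask, False


def is_causal(mask):
    """Check that a boolean mask is exactly lower-triangular."""
    return all(
        keep == (col <= row)
        for row, values in enumerate(mask)
        for col, keep in enumerate(values)
    )


# A 4x4 causal mask: position `row` may attend to positions 0..row.
tril = [[col <= row for col in range(4)] for row in range(4)]

with_flag = resolve_mask_args(tril, assume_causal_mask=True)
without_flag = resolve_mask_args(tril)
```

The flag trusts the caller: nothing verifies at export time that the mask really is causal. A check like `is_causal` is the kind of assertion a corresponding test could make on concrete inputs.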

kimishpatel

Some nits and special handling for mask and is_causal requested

kimishpatel · 2025-04-23T02:51:59Z

cc: @guangy10 @larryliu0820

Summary: Pull Request resolved: pytorch#10355 Reviewed By: billmguo Differential Revision: D73444078

facebook-github-bot · 2025-04-24T17:12:16Z

This pull request was exported from Phabricator. Differential Revision: D73444078

sxu · 2025-04-28T16:11:09Z

@kimishpatel can you take another look?

kimishpatel

Thanks for the changes. Looks good. @guangy10 we should look into adopting this as well.

Summary: Pull Request resolved: pytorch#10355 Reviewed By: billmguo, kimishpatel Differential Revision: D73444078

facebook-github-bot · 2025-04-29T03:01:16Z

This pull request was exported from Phabricator. Differential Revision: D73444078

sxu requested review from iseeyuan, jackzhxng, larryliu0820 and swolchok as code owners April 22, 2025 16:44

facebook-github-bot added the CLA Signed label Apr 22, 2025

facebook-github-bot added the fb-exported label Apr 22, 2025

sxu added the topic: not user facing label Apr 22, 2025

sxu added a commit to sxu/executorch that referenced this pull request Apr 22, 2025

LLM export pass to swap in custom SDPA (pytorch#10355)

e8fa403

Summary: Pull Request resolved: pytorch#10355 Differential Revision: D73444078

sxu force-pushed the export-D73444078 branch from d266e75 to e8fa403 Compare April 22, 2025 17:05

sxu added a commit to sxu/executorch that referenced this pull request Apr 22, 2025

LLM export pass to swap in custom SDPA (pytorch#10355)

785b463

Summary: Pull Request resolved: pytorch#10355 Differential Revision: D73444078

sxu force-pushed the export-D73444078 branch from e8fa403 to 785b463 Compare April 22, 2025 17:16

sxu added a commit to sxu/executorch that referenced this pull request Apr 22, 2025

LLM export pass to swap in custom SDPA (pytorch#10355)

2fa9f60

Summary: Pull Request resolved: pytorch#10355 Differential Revision: D73444078

sxu force-pushed the export-D73444078 branch from 785b463 to 2fa9f60 Compare April 22, 2025 17:35

sxu requested a review from kimishpatel April 22, 2025 17:58

jackzhxng reviewed Apr 22, 2025

kimishpatel reviewed Apr 23, 2025

kimishpatel reviewed Apr 23, 2025

kimishpatel reviewed Apr 23, 2025

kimishpatel requested changes Apr 23, 2025

sxu force-pushed the export-D73444078 branch from 2fa9f60 to 05eaf00 Compare April 24, 2025 17:12

sxu added a commit to sxu/executorch that referenced this pull request Apr 24, 2025

LLM export pass to swap in custom SDPA (pytorch#10355)

05eaf00

Summary: Pull Request resolved: pytorch#10355 Reviewed By: billmguo Differential Revision: D73444078

sxu requested a review from kimishpatel April 24, 2025 17:20

kimishpatel approved these changes Apr 29, 2025

LLM export pass to swap in custom SDPA (pytorch#10355)

40181b2

Summary: Pull Request resolved: pytorch#10355 Reviewed By: billmguo, kimishpatel Differential Revision: D73444078

sxu force-pushed the export-D73444078 branch from 05eaf00 to 40181b2 Compare April 29, 2025 03:00

facebook-github-bot merged commit 7054b1f into pytorch:main Apr 29, 2025
84 of 86 checks passed
