Metal backend: Add operator implementations #15023

manuelcandales · 2025-10-10T21:01:51Z

Adds bfloat16/float32 working implementations of the following AOTI shim ops:

aoti_torch_mps_mm_out
aoti_torch_mps_convolution
aoti_torch_mps__scaled_dot_product_attention_math_for_mps

Adds a stub implementation of aoti_torch_mps_addmm_out
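
For reference, the semantics of the mm op (out = self @ mat2) can be sketched on the CPU as below. This is only an illustrative float32 reference, not the MPSGraph-based code from this PR; `mm_reference` and its row-major layout are assumptions made for the example. Something like it could be used to check the Metal results on small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for out = self @ mat2.
// self is [M, K], mat2 is [K, N], both row-major; the result is [M, N].
std::vector<float> mm_reference(
    const std::vector<float>& self,
    const std::vector<float>& mat2,
    std::size_t M,
    std::size_t K,
    std::size_t N) {
  assert(self.size() == M * K);
  assert(mat2.size() == K * N);

  std::vector<float> out(M * N, 0.0f);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t k = 0; k < K; ++k) {
      const float lhs = self[i * K + k];
      for (std::size_t j = 0; j < N; ++j) {
        // Accumulate the contribution of column k of self and row k of mat2.
        out[i * N + j] += lhs * mat2[k * N + j];
      }
    }
  }
  return out;
}
```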


manuelcandales · 2025-10-10T21:01:52Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2025-10-10T21:01:55Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15023

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 9d69769 with merge base f4d801a:

NEW FAILURES - The following jobs have failed:

pull / test-multimodal-linux (gemma3-4b) / linux-job (gh)
RuntimeError: Command docker exec -t 5b203c69f2eefedda1db2ef70088be31ee109ef7e0ecfbda22bad448760227e0 /exec failed with exit code 139
pull / unittest-arm-backend-with-no-fvp (test_pytest_ops) / linux-job (gh)
RuntimeError: Command docker exec -t 0e843a1850854952717628079b9a54411945b83b6d701fac188b7941b890cf21 /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.


backends/apple/metal/runtime/shims/et_metal_ops.mm

mergennachin · 2025-10-12T17:36:31Z

backends/apple/metal/runtime/shims/et_metal_ops.h

+ * ExecutorTorch implementation of aoti_torch_mps_mm_out.
+ * Performs simple matrix multiplication: out = self @ mat2
+ */
+AOTITorchError aoti_torch_mps_mm_out(


Do custom ops use a caching mechanism like the ETMetalShaderLibrary?

manuelcandales · 2025-10-15

No, not yet. These fallback ops are implemented using MPSGraph, so what we would cache here is the graph. I want to look into this later when optimizing performance, but it deserves dedicated time, particularly because I never understood why MPSGraph operations have non-trivial CPU overhead in PyTorch even though PyTorch has a caching mechanism for MPSGraphs.
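
A graph cache of the kind discussed above might look roughly like the following. This is a minimal sketch, not code from this PR or from PyTorch: `CachedGraph`, `GraphCache`, and `make_graph_key` are hypothetical names, and keying on op name, dtype code, and input shape is an assumption about what distinguishes one graph from another.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a built MPSGraph and its placeholder tensors.
struct CachedGraph {
  std::string description;
};

// Builds a cache key from the op name, a dtype code, and the input shape.
std::string make_graph_key(
    const std::string& op_name,
    int32_t dtype,
    const std::vector<int64_t>& shape) {
  std::string key = op_name + ":" + std::to_string(dtype);
  for (int64_t dim : shape) {
    key += ":" + std::to_string(dim);
  }
  return key;
}

class GraphCache {
 public:
  // Returns the cached graph for `key`; calls `build` only on a cache miss.
  std::shared_ptr<CachedGraph> lookup_or_create(
      const std::string& key,
      const std::function<std::shared_ptr<CachedGraph>()>& build) {
    assert(build && "build callback must be callable");
    auto it = cache_.find(key);
    if (it != cache_.end()) {
      return it->second;
    }
    std::shared_ptr<CachedGraph> graph = build();
    cache_.emplace(key, graph);
    return graph;
  }

  std::size_t size() const { return cache_.size(); }

 private:
  std::unordered_map<std::string, std::shared_ptr<CachedGraph>> cache_;
};
```

With this shape, a repeated call with the same op, dtype, and shape skips graph construction entirely. Whether that removes the CPU overhead mentioned above is exactly the open question; the sketch only shows the lookup structure.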

mergennachin · 2025-10-12T17:41:41Z

backends/apple/metal/runtime/shims/et_metal_ops.mm

+        // For attention weights, zero-fill the GPU buffer (shared memory allows CPU memset)
+        std::memset(attn_contents_ptr, 0, attn_size_bytes);


Do you need zero-filling here?

manuelcandales · 2025-10-15

Well, I thought it was nicer to return zeros rather than uninitialized memory.

mergennachin · 2025-10-12T17:46:20Z

backends/apple/metal/runtime/shims/et_metal_ops.mm

+
+        // Set output tensor handles
+        *ret0 = out_tensor_handle;
+        *ret1 = attn_tensor_handle;


Is ret1 actually populated, or just zeroed?

manuelcandales · 2025-10-15

It is just zeroed.
We are using MPSGraph's scaledDotProductAttention, which only returns the output tensor.
We have to return an attention tensor to match the _scaled_dot_product_attention_math_for_mps signature, but we don't really need it: it gets thrown away here.
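
The contract described in this thread (a real attention output plus a zero-filled placeholder for the attention weights) can be illustrated with a small CPU sketch. It is not the MPSGraph implementation from the PR: `sdpa_reference` and `SdpaResult` are hypothetical names, and the sketch assumes a single head, row-major float32 data, no mask, no dropout, and the default scale of 1/sqrt(E).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type: `out` is [L, Ev]; `attn` is [L, S] and is only a
// zero-filled placeholder so that the two-output signature is satisfied.
struct SdpaResult {
  std::vector<float> out;
  std::vector<float> attn;
};

// q is [L, E], k is [S, E], v is [S, Ev], all row-major.
SdpaResult sdpa_reference(
    const std::vector<float>& q,
    const std::vector<float>& k,
    const std::vector<float>& v,
    std::size_t L,
    std::size_t S,
    std::size_t E,
    std::size_t Ev) {
  assert(S > 0 && E > 0);
  assert(q.size() == L * E && k.size() == S * E && v.size() == S * Ev);

  const float scale = 1.0f / std::sqrt(static_cast<float>(E));
  SdpaResult result{
      std::vector<float>(L * Ev, 0.0f), std::vector<float>(L * S, 0.0f)};
  std::vector<float> scores(S);

  for (std::size_t i = 0; i < L; ++i) {
    // Scaled dot products of query row i against every key row.
    float max_score = -std::numeric_limits<float>::infinity();
    for (std::size_t j = 0; j < S; ++j) {
      float dot = 0.0f;
      for (std::size_t e = 0; e < E; ++e) {
        dot += q[i * E + e] * k[j * E + e];
      }
      scores[j] = dot * scale;
      max_score = std::max(max_score, scores[j]);
    }
    // Numerically stable softmax over the scores.
    float denom = 0.0f;
    for (std::size_t j = 0; j < S; ++j) {
      scores[j] = std::exp(scores[j] - max_score);
      denom += scores[j];
    }
    // Weighted sum of value rows. The weights are deliberately not copied
    // into result.attn, which stays zero-filled.
    for (std::size_t j = 0; j < S; ++j) {
      const float weight = scores[j] / denom;
      for (std::size_t c = 0; c < Ev; ++c) {
        result.out[i * Ev + c] += weight * v[j * Ev + c];
      }
    }
  }
  return result;
}
```

A caller that relied on the second output would see zeros rather than real attention weights, which is acceptable only because, as noted above, that tensor is discarded.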


manuelcandales added 8 commits October 10, 2025 13:29

Update · 6420712
Update · d036c07
Update · 1a22c5e
Update · d6f0bc9
Update · 7e11615
Update · dfa435a
Update · 648ee07
Update · 3bea537

manuelcandales requested review from cccclai and shoumikhin as code owners October 10, 2025 21:01

meta-cla bot added the CLA Signed label Oct 10, 2025

manuelcandales requested review from larryliu0820 and mergennachin and removed request for cccclai and shoumikhin October 10, 2025 21:03

manuelcandales added 5 commits October 11, 2025 15:47

Update · ca5f1e5
Update · 7e971b0
Update · f12117b
Update · 5dfcd4f
Update · de83a9f

mergennachin reviewed Oct 12, 2025


manuelcandales added 24 commits October 14, 2025 20:57

Update · aea11e8
Update · 7f178d3
Update · f46adc5
Update · 16d863c
Update · c80142d
Update · c3e9d0a
Update · 780d883
Update · b782bb5
Update · cf93ffd
Update · 4eaa345
Update · 61ead64
Update · 750badf
Update · 71f87b6
Update · 930f6b9
Update · 2667a0c
Update · 6a6ba04
Update · 95a7024
Update · f214162
Update · e8b9828
Update · 7c1b9b2
Update · d37e7ef
Update · 1506e5f
Update · 6f6fd58
Update · 4367977

Base automatically changed from gh/manuelcandales/142/head to main October 17, 2025 01:58

Update · 9d69769

larryliu0820 approved these changes Oct 17, 2025


mergennachin approved these changes Oct 17, 2025


manuelcandales merged commit 7b7525e into main Oct 17, 2025
143 of 145 checks passed

manuelcandales deleted the gh/manuelcandales/143/head branch October 17, 2025 14:57

