Fix Metal MPS encoder lifecycle and broaden macOS compatibility #1

Closed: robtaylor wants to merge 2 commits into main from fix-metal-mps-lifecycle

@robtaylor

CI test for huggingface#308

Summary

  • Fix MPS encoder lifecycle crash on sequential kernel calls
  • Lower Metal standard to metal3.1 (macOS 14+)
  • Multi-strategy Metal toolchain detection (macOS 14/15/26)
  • macOS CI matrix: macos-14, macos-15, macos-26-xlarge

Use stream->commandEncoder() instead of creating encoders directly via
[cmdBuf computeCommandEncoder] to properly integrate with PyTorch's MPS
stream encoder lifecycle management (kernel coalescing). Direct encoder
creation bypasses the stream's internal _commandEncoder state and crashes
on sequential kernel dispatches.
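
The conflict described above can be illustrated with a toy model. This is a sketch in Python, not PyTorch or Metal code: the classes `CommandBuffer`, `Encoder`, and `Stream` are hypothetical stand-ins, and the model assumes the failure arises because the stream keeps its cached encoder open for coalescing while a second encoder is opened directly on the same command buffer.

```python
class Encoder:
    """Toy compute encoder that counts the dispatches recorded on it."""

    def __init__(self, cmd_buf):
        self.cmd_buf = cmd_buf
        self.dispatches = 0

    def dispatch(self):
        self.dispatches += 1

    def end_encoding(self):
        self.cmd_buf.open_encoder = None


class CommandBuffer:
    """Toy command buffer: only one encoder may be open at a time."""

    def __init__(self):
        self.open_encoder = None

    def compute_command_encoder(self):
        if self.open_encoder is not None:
            raise RuntimeError("an encoder is already open on this command buffer")
        self.open_encoder = Encoder(self)
        return self.open_encoder


class Stream:
    """Toy stream that caches its encoder so consecutive kernels share it."""

    def __init__(self):
        self.cmd_buf = CommandBuffer()
        self._command_encoder = None

    def command_encoder(self):
        # Reuse the cached encoder if one is open (kernel coalescing).
        if self._command_encoder is None:
            self._command_encoder = self.cmd_buf.compute_command_encoder()
        return self._command_encoder


def run_via_stream(stream):
    # Pattern adopted by this PR: ask the stream for its encoder.
    stream.command_encoder().dispatch()


def run_direct(stream):
    # Pattern removed by this PR: open an encoder behind the stream's back.
    enc = stream.cmd_buf.compute_command_encoder()
    enc.dispatch()
    enc.end_encoding()
```

In this model, two consecutive `run_via_stream` calls record both dispatches on the same encoder, while `run_direct` after a stream-managed dispatch raises because the stream's encoder is still open.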

Lower the default Metal standard from metal3.2 (macOS 15+) to metal3.1
(macOS 14+) since all current kernel features (bfloat16_t, simd_sum,
simd_shuffle, threadgroup_barrier) are available in Metal 3.1.
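
As a rough illustration of the trade-off, the version requirements quoted above can be written down as a small lookup. The table holds only the two standards named in this PR; the helper names `runs_on` and `std_flag` are hypothetical and exist only for this sketch.

```python
# Minimum macOS major version for each Metal language standard,
# as stated in this PR description.
MIN_MACOS = {
    "metal3.1": 14,
    "metal3.2": 15,
}


def runs_on(standard, macos_major):
    """Return True if kernels built for `standard` can target `macos_major`."""
    return macos_major >= MIN_MACOS[standard]


def std_flag(standard):
    """Format the standard as a compiler flag, e.g. -std=metal3.1."""
    return f"-std={standard}"
```

With the previous default, `runs_on("metal3.2", 14)` is false, so macOS 14 hosts were excluded; with `metal3.1` every version in the CI matrix qualifies.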

Add multi-strategy Metal toolchain detection for macOS 14+:
- Separate Metal toolchain component (macOS 26+ cryptex mount)
- xcrun/xcode-select based detection
- Direct /Applications/Xcode*.app filesystem scan fallback
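
The fallback order above can be sketched as a chain of strategies that returns the first hit. This is an illustrative Python sketch, not the Nix code from the PR: the function names are hypothetical, the toolchain path inside the Xcode bundle is an assumption, and the macOS 26+ cryptex strategy is omitted because its mount location is not given here.

```python
import glob
import os
import shutil
import subprocess


def via_xcrun():
    """Ask xcrun for the metal compiler; None if xcrun is missing or fails."""
    if shutil.which("xcrun") is None:
        return None
    try:
        result = subprocess.run(
            ["xcrun", "--sdk", "macosx", "--find", "metal"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return result.stdout.strip() or None


def via_xcode_scan(pattern="/Applications/Xcode*.app"):
    """Scan Xcode bundles for a metal binary (relative path is an assumption)."""
    rel = "Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/metal"
    for app in sorted(glob.glob(pattern), reverse=True):
        candidate = os.path.join(app, rel)
        if os.access(candidate, os.X_OK):
            return candidate
    return None


def find_metal(strategies):
    """Try each strategy in order and return the first path found, else None."""
    for strategy in strategies:
        path = strategy()
        if path:
            return path
    return None
```

A caller would pass the strategies in priority order, for example `find_metal([via_xcrun, via_xcode_scan])`, with a toolchain-component strategy placed first on macOS 26+.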

Also clear SDKROOT in xcrunHost to prevent Nix-set SDK paths from
interfering with system xcrun.
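
The same idea, expressed as a Python sketch rather than the Nix wrapper in the PR: drop `SDKROOT` from the environment before invoking the host `xcrun`. The helper name `host_env` is hypothetical.

```python
import os


def host_env(environ=None):
    """Return a copy of the environment without SDKROOT."""
    env = dict(os.environ if environ is None else environ)
    # A Nix-provided SDKROOT would otherwise steer xcrun away from the host SDK.
    env.pop("SDKROOT", None)
    return env
```

The result can be passed as `env=host_env()` to `subprocess.run` when calling `xcrun`, leaving the calling process's own environment untouched.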

Fixes: huggingface#307

Co-developed-by: Claude Code v2.1.50 (claude-opus-4-6)

Test Metal kernel builds across multiple macOS versions to verify
compatibility with the metal3.1 standard (macOS 14+). Use sandbox=relaxed
for Nix to support __noChroot builds that access the host Metal toolchain.
The separate Metal toolchain download is only needed on macOS 26+.

Co-developed-by: Claude Code v2.1.50 (claude-opus-4-6)
@robtaylor

Superseded by a new PR from the metal-stack branch with stacked patches.

@robtaylor robtaylor closed this Mar 4, 2026