jeffra (Collaborator) commented Jan 8, 2026

I tried running DeepSpeed (ZeRO stage 0) with fp32 on an MPS device and ran into a few issues. These fixes resolve them:

  • MPS has no CUDA-like timer events, so timing should fall back to host timers.
  • When the abstract accelerator doesn't define a communication backend, we shouldn't trigger a broadcast or all-reduce.
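
A minimal sketch of the two behaviors described above, not the code merged in this PR. The names `elapsed_ms`, `sync_params`, `event_cls`, `backend_name`, and `broadcast_fn` are hypothetical. `event_cls` is assumed to be any class with a `torch.cuda.Event`-like interface (`record`, `synchronize`, `elapsed_time`).

```python
import time


def elapsed_ms(fn, event_cls=None):
    """Time fn() in milliseconds.

    Uses device events when event_cls is given (assumed to look like
    torch.cuda.Event); otherwise falls back to a host wall-clock timer.
    """
    if event_cls is not None:
        start, end = event_cls(), event_cls()
        start.record()
        fn()
        end.record()
        end.synchronize()
        return start.elapsed_time(end)
    # Host fallback: no device timer events available (e.g. on MPS).
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1000.0


def sync_params(params, backend_name, broadcast_fn):
    """Broadcast each parameter unless no comm backend is defined.

    Returns the number of parameters that were broadcast.
    """
    if backend_name is None:
        # No communication backend: skip the collective entirely.
        return 0
    for p in params:
        broadcast_fn(p)
    return len(params)
```

With `backend_name=None`, `sync_params` returns 0 and never calls `broadcast_fn`, which is the single-device case described in the second bullet.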

tohtana (Collaborator) commented Jan 8, 2026

Hi @jeffra,
I really appreciate your contribution! This looks good to me, but shouldn’t we show a warning message saying we are skipping weight synchronization and gradient reduction just in case?
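
One way the suggested warning could look, as a hedged sketch only. The thread does not say whether a warning was added before merging. `warn_once`, `all_reduce_or_skip`, and the logger name are hypothetical, and the sketch warns once per process rather than on every step.

```python
import logging

logger = logging.getLogger("mps_sketch")  # hypothetical logger name
_warned = set()


def warn_once(key, message):
    """Log message at WARNING level the first time key is seen.

    Returns True if the message was logged, False if it was suppressed.
    """
    if key in _warned:
        return False
    _warned.add(key)
    logger.warning(message)
    return True


def all_reduce_or_skip(grads, backend_name, all_reduce_fn):
    """All-reduce each gradient, or skip with a one-time warning.

    Returns True if the reduction ran, False if it was skipped.
    """
    if backend_name is None:
        warn_once(
            "skip_comm",
            "No communication backend defined; skipping weight "
            "synchronization and gradient reduction.",
        )
        return False
    for g in grads:
        all_reduce_fn(g)
    return True
```

Warning once keeps the log readable, since the skip would otherwise be reported on every training step.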

tohtana merged commit e5b9380 into master Jan 9, 2026
10 checks passed
tohtana deleted the jeff/mps-fixes branch January 9, 2026 23:39