v0.18.4 Patch Release

@loadams released this 07 Jan 22:58 · 64 commits to master since this release · b35d9eb

What's Changed

  • Update version by @sfc-gh-truwase in #7719
  • Disable deterministic option in compile tests by @tohtana in #7720
  • Fix SuperOffloadOptimizer_Stage3 crash due to missing param_names parameter by @ImaGoodFella in #7715
  • [AMD][ROCm] Improve support of AMD by @k-artem in #7448
  • fix typo by @stas00 in #7722
  • Skip none in backward hook by @tohtana in #7725
  • [Engine] Only scale gradients if scale_wrt_gas is True by @kashif in #7724
  • Fix testcases that depend on triton by @k-artem in #7731
  • Fix rare hang in DeepSpeed Async I/O wait by releasing the Python GIL by @xylian86 in #7727
  • Fix #7733: Replace torch.sqrt with math.sqrt in scale_lr for sqrt method by @Rakshit-gen in #7735
  • replace moe checkpoint dp_world_size with seq_dp_world_size by @wukong1992 in #7732
  • [BUG] Fix UlyssesSPAttentionHF.register_with_transformers() crash with PEFT models by @Rakshit-gen in #7737
  • Add core api update blog by @tohtana in #7738
  • Fix Nebula checkpoint engine commit() API mismatch by @Rakshit-gen in #7740
  • Fix DecoupledCheckpointEngine deadlock and improve reliability by @Rakshit-gen in #7742
  • Fix OnebitLamb NaN propagation with empty parameters by @Rakshit-gen in #7736
  • fix: remove premature MPI environment variable check in OpenMPIRunner by @leejianwoo-collab in #7751
  • Enable python 3.11 and 3.12 tests by @loadams in #7007
  • Add CI workflow to run tests on AWS by @tohtana in #7753
  • Add fallback to BF16 support check by @tohtana in #7754
  • Fix DeepCompile for PyTorch 2.8/2.9 compatibility by @tohtana in #7755
  • Removed amp testcases by @k-artem in #7745
  • fix: avoid IndexError in BF16_Optimizer.destroy() when using DummyOptim by @leejianwoo-collab in #7763
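
To illustrate the kind of change described in #7735, here is a minimal sketch of square-root learning-rate scaling done with `math.sqrt` on plain Python numbers. `torch.sqrt` accepts only tensors, so a Python float passed to it raises a `TypeError`; `math.sqrt` has no such restriction. The function name `scale_lr_sqrt` and its parameters are hypothetical and are not DeepSpeed's actual API. The sketch also assumes the scale factor is the ratio of the new batch size to a base batch size.

```python
import math


def scale_lr_sqrt(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Scale a learning rate by the square root of the batch-size ratio.

    Hypothetical helper for illustration only; not DeepSpeed's scale_lr.
    """
    if base_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    # math.sqrt works on plain Python numbers; torch.sqrt would require a tensor.
    return base_lr * math.sqrt(batch_size / base_batch_size)


if __name__ == "__main__":
    # Quadrupling the batch size doubles the learning rate under sqrt scaling.
    print(scale_lr_sqrt(1e-3, 32, 128))
```

Keeping the computation in plain Python floats also avoids creating a tensor for a value that is only ever used as a scalar hyperparameter.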

Full Changelog: v0.18.3...v0.18.4