@leejianwoo-collab
Contributor

docs: update Megatron-DeepSpeed tutorial to match current repo structure

  • Update outdated file paths and script names in docs/_tutorials/megatron.md.
  • Replace scripts/ with examples/ for training scripts.
  • Replace pretrain_gpt2.py with pretrain_gpt.py.
  • Correct locations for arguments.py and utils.py to megatron/.
  • Ensure tutorial instructions align with the latest Megatron-DeepSpeed repository layout.

Resolves #7757

@sfc-gh-truwase
Collaborator

@leejianwoo-collab thanks for updating the tutorial. Can you please share how you are using Megatron-DeepSpeed? I am curious because we are not aware of much interest in the repo these days.

@leejianwoo-collab
Contributor Author

leejianwoo-collab commented Jan 6, 2026

"Thanks @sfc-gh-truwase!

To be honest, I was just going through the tutorials to learn more about training large models with DeepSpeed. I hit some errors because of the file path mismatches while following the Megatron-DeepSpeed guide, so I decided to fix them to help others who might be starting out like me."

If you want to know more details, you can ask @kimmeoungjun.

@sfc-gh-truwase
Collaborator

@leejianwoo-collab thanks for fixing the docs. Can you please resolve the DCO issue to unblock the merging?

@tohtana tohtana enabled auto-merge (squash) January 7, 2026 08:39
auto-merge was automatically disabled January 7, 2026 08:54

Head branch was pushed to by a user without write access

…im (deepspeedai#7763)

fix: avoid IndexError in BF16_Optimizer.destroy() when using DummyOptim

Short-circuit BF16_Optimizer.destroy() if using_real_optimizer is False.
When initialized with optimizer=None (DummyOptim), bf16_groups remains
empty, causing an IndexError when accessing it in destroy().

Resolves deepspeedai#7752

Signed-off-by: leejianwoo-collab <[email protected]>
Signed-off-by: leejianwoo-collab <[email protected]>
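
For readers following the commit message above, here is a minimal sketch of the short-circuit it describes. It is not the DeepSpeed implementation: `SketchBF16Optimizer`, its constructor argument, and the boolean return value are hypothetical stand-ins, and only the names `using_real_optimizer`, `bf16_groups`, and `destroy()` come from the commit message.

```python
class SketchBF16Optimizer:
    """Hypothetical stand-in for illustration; not DeepSpeed's BF16_Optimizer."""

    def __init__(self, param_groups=None):
        # Assumption: passing None models the "no real optimizer" (DummyOptim) case.
        self.using_real_optimizer = param_groups is not None
        # With no real optimizer, the group list is never populated.
        self.bf16_groups = (
            [list(group) for group in param_groups]
            if self.using_real_optimizer
            else []
        )

    def destroy(self):
        # Short-circuit: nothing was set up, so there is nothing to tear down.
        if not self.using_real_optimizer:
            return False
        # Indexing like this is what raises IndexError on an empty list.
        _first_group = self.bf16_groups[0]
        self.bf16_groups.clear()
        return True
```

Without the early return, calling `destroy()` on an instance built with `param_groups=None` would index into an empty list and raise `IndexError`, which matches the failure mode the commit message reports.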

Development

Successfully merging this pull request may close these issues.

Docs: Megatron tutorial references outdated repository structure

3 participants