@kmehant
Collaborator

@kmehant kmehant commented Oct 14, 2025

Summary of changes

  1. Gradient clipping when EP + FSDPv2 is used.
  2. Patches to fix CPU-RAM-efficient loading when FSDPv2 + EP is turned on (ignored params). Not pushed upstream for merge, since the changes assume that EP parameters can be identified by their device placement.
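
The summary does not describe the clipping logic itself. As a rough illustration only, and not the code in this PR, the sketch below assumes that expert-parallel (EP) gradients and FSDP-sharded gradients are held as separate groups and that clipping should use a single global L2 norm over both. The function name `clip_grads_across_groups` and the plain-list gradient representation are hypothetical; a real run would operate on tensors and reduce each group's squared sum over its own process group.

```python
import math


def clip_grads_across_groups(grad_groups, max_norm, eps=1e-6):
    """Clip gradients in place using one global L2 norm over all groups.

    grad_groups: list of groups (e.g. [fsdp_grads, ep_grads]); each group is
    a list of gradients, and each gradient is a list of floats.
    Returns the global norm measured before clipping.
    """
    # Sum of squares per group. Assumption: in a distributed setting each
    # value would be all-reduced over that group's own process group.
    sq_sums = [sum(g * g for grad in group for g in grad) for group in grad_groups]

    # Combine the groups into a single global norm.
    total_norm = math.sqrt(sum(sq_sums))

    # Same scaling rule as common clip-by-norm utilities: never scale up.
    clip_coef = min(1.0, max_norm / (total_norm + eps))

    for group in grad_groups:
        for grad in group:
            for i in range(len(grad)):
                grad[i] *= clip_coef
    return total_norm
```

For example, with `fsdp_grads = [[3.0]]` and `ep_grads = [[4.0]]`, the function measures a global norm of 5.0 and, for `max_norm=1.0`, scales both groups by the same coefficient so that their relative magnitudes are preserved.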

Signed-off-by: Mehant Kammakomati <[email protected]>
Signed-off-by: Mehant Kammakomati <[email protected]>
@kmehant kmehant requested a review from ashokponkumar October 14, 2025 20:37
@kmehant kmehant changed the title from "FSDP2 with MoE kernels and expert parallel" to "feat : FSDP2 with MoE kernels and expert parallel" Oct 14, 2025
@kmehant kmehant changed the title from "feat : FSDP2 with MoE kernels and expert parallel" to "feat: FSDP2 with MoE kernels and expert parallel" Oct 14, 2025
Signed-off-by: Mehant Kammakomati <[email protected]>
Signed-off-by: Mehant Kammakomati <[email protected]>
@kmehant kmehant merged commit f06964d into foundation-model-stack:main Oct 15, 2025
8 checks passed

2 participants