Skip to content

Conversation

@kmehant
Copy link
Collaborator

@kmehant kmehant commented Mar 6, 2025

Add a wrapper function to wrap complete dcp to safetensor logic in order to enable easy downstream usage through this function without needing to redundantly repeat the same set of steps.

@kmehant kmehant force-pushed the moe-hf-chkpt branch 4 times, most recently from b2f6f62 to be45088 Compare March 6, 2025 08:08
@fabianlim
Copy link
Contributor

@kmehant actually this design goes agains the ADR. we purposedly did not want fms-acceleration to modify the trainer. I feel the callback should be better installed in fms-hf-tuning

@kmehant kmehant force-pushed the moe-hf-chkpt branch 3 times, most recently from 196cca5 to b18be35 Compare March 6, 2025 08:47
Signed-off-by: Mehant Kammakomati <[email protected]>
@kmehant kmehant changed the title feat: Support MoE HF checkpointing chore: Move dcp to safetensors to a wrapper function for downstream usage Mar 6, 2025
@kmehant kmehant marked this pull request as ready for review March 6, 2025 08:52
@kmehant kmehant requested a review from fabianlim as a code owner March 6, 2025 08:52
@kmehant
Copy link
Collaborator Author

kmehant commented Mar 6, 2025

@fabianlim requesting your review. Thanks

Copy link
Contributor

@fabianlim fabianlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for taking my comment into account to not change the API. LGTM

@kmehant
Copy link
Collaborator Author

kmehant commented Mar 6, 2025

@fabianlim or @willmj Shall we merge this?

@fabianlim
Copy link
Contributor

Let @willmj look at it first

@kmehant kmehant requested a review from willmj March 6, 2025 15:36
Copy link
Collaborator

@willmj willmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks

@fabianlim fabianlim merged commit f7210f7 into foundation-model-stack:main Mar 7, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants