[Feat] add bf16 sft to mxfp4 conversion #108
Closed
+363
−0
Add bf16 SFT to mxfp4 conversion
Currently the model can run either in bf16 (group-wise fp8 is also possible) on H800/H100, or in MXFP4 on Blackwell.
After SFT-ing the GPT-OSS model and injecting the new identity, we need to convert the model back to MXFP4 to reduce its size and load weights from HBM with 4-bit IO (on H800, the model is converted back to bf16 at runtime).
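As a rough illustration of what the bf16 → MXFP4 round trip does, here is a minimal NumPy sketch (not the code in this PR). It assumes the standard MX layout: blocks of 32 elements sharing one power-of-two (E8M0) scale, with each element snapped to the nearest FP4 (E2M1) value; the scale choice follows the usual `floor(log2(amax)) - 2` rule, so the largest elements in a block may clip to 6.0.

```python
import numpy as np

# Representable FP4 (E2M1) magnitudes; the sign is a separate bit.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def mxfp4_quantize_dequantize(w, block_size=32):
    """Round-trip weights through MXFP4: per block of 32 elements, pick a
    shared power-of-two scale (E8M0) and snap each element to the nearest
    FP4 (E2M1) value. Returns the dequantized tensor for error inspection."""
    w = np.asarray(w, dtype=np.float32)
    flat = w.reshape(-1, block_size)
    out = np.empty_like(flat)
    for i, block in enumerate(flat):
        amax = np.abs(block).max()
        if amax == 0.0:
            out[i] = 0.0
            continue
        # Shared exponent: floor(log2(amax)) minus E2M1's max exponent (2),
        # so amax/scale lands in [4, 8); values above 6 clip to the grid max.
        exp = int(np.floor(np.log2(amax))) - 2
        scale = np.float32(2.0 ** exp)
        scaled = block / scale
        # Snap each magnitude to the nearest FP4 grid point, keeping the sign.
        idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID).argmin(axis=1)
        out[i] = np.sign(scaled) * FP4_GRID[idx] * scale
    return out.reshape(w.shape)
```

Weights that already sit on the scaled FP4 grid survive the round trip exactly, which is what makes the pairwise weight comparison below meaningful.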
Verification of correctness
We checked the model end-to-end and compared the fp4 weight values pair by pair:
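A pairwise fp4 comparison along these lines can be sketched as follows (a hypothetical helper, not code from this PR). It assumes the fp4 weights are stored packed, two 4-bit codes per `uint8`, and reports the fraction of mismatched codes:

```python
import numpy as np

def compare_fp4_codes(a, b):
    """Compare two packed-FP4 weight tensors (uint8, two 4-bit codes per
    byte) code by code and return the mismatch rate in [0, 1]."""
    assert a.shape == b.shape and a.dtype == b.dtype == np.uint8
    # Unpack low and high nibbles so every 4-bit code is compared directly.
    lo_a, hi_a = a & 0x0F, a >> 4
    lo_b, hi_b = b & 0x0F, b >> 4
    mismatch = np.count_nonzero(lo_a != lo_b) + np.count_nonzero(hi_a != hi_b)
    return mismatch / (2 * a.size)
```

Comparing the raw 4-bit codes (plus the per-block scales) is stricter than comparing dequantized values, since it catches packing-order bugs that dequantization could mask.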