@dushyantbehl (Collaborator) commented Sep 1, 2025

Description of the change

  1. Adds the code changes required to support the new GPT-OSS models.
  2. Adds a new Dockerfile for dev mode, based on nvcr, that supports flash attention 3.
  3. Updates most of the required package versions.
  4. Adds a new quantization class for MXFP4 that dequantizes the model before training.
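
For readers unfamiliar with the format, here is a rough sketch of what "dequantizes" means for MXFP4 in item 4. It is not the class added in this PR. It is a pure-Python illustration that assumes the usual microscaling layout: blocks of 32 FP4 (E2M1) values that share one E8M0 power-of-two scale. The function names and the nibble order (low nibble first) are assumptions made for the example, and the E8M0 NaN encoding (255) is not handled.

```python
# FP4 (E2M1) magnitudes indexed by the low 3 bits of a 4-bit code; bit 3 is the sign.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 32  # one shared scale per 32 elements


def decode_e2m1(code: int) -> float:
    """Decode a single 4-bit E2M1 code into a Python float."""
    magnitude = E2M1_MAGNITUDES[code & 0b0111]
    return -magnitude if code & 0b1000 else magnitude


def dequantize_block(packed: bytes, scale_e8m0: int) -> list[float]:
    """Expand one block (16 packed bytes plus 1 scale byte) into 32 floats.

    Assumption: the low nibble of each byte holds the earlier element.
    """
    if len(packed) != BLOCK_SIZE // 2:
        raise ValueError(f"expected {BLOCK_SIZE // 2} bytes, got {len(packed)}")
    scale = 2.0 ** (scale_e8m0 - 127)  # E8M0 is a biased power-of-two exponent
    values = []
    for byte in packed:
        values.append(decode_e2m1(byte & 0x0F) * scale)  # low nibble first (assumed)
        values.append(decode_e2m1(byte >> 4) * scale)
    return values
```

For example, `dequantize_block(bytes([0x21] + [0] * 15), 128)[:2]` gives `[1.0, 2.0]`: the codes `0x1` and `0x2` decode to 0.5 and 1.0, and a scale byte of 128 multiplies both by 2. A real implementation would operate on whole tensors and cast the result to the training dtype.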

Related issue number

How to verify the PR

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass.

github-actions bot commented Sep 1, 2025

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

@github-actions github-actions bot added the feat label Sep 1, 2025
enable cuda 12.8 in dockerfile and add triton kernels in pyproject

Signed-off-by: Dushyant Behl <[email protected]>
@dushyantbehl changed the title from "feat: Support gpt-oss class of models with flash attention support" to "feat: Support gpt-oss class of models with flash attention 3 support" on Sep 2, 2025
@ashokponkumar (Collaborator) left a comment

Nice. Some suggestions.

Signed-off-by: Harikrishnan Balagopal <[email protected]>
Signed-off-by: Dushyant Behl <[email protected]>
ashokponkumar previously approved these changes Sep 2, 2025
Signed-off-by: Dushyant Behl <[email protected]>
@ashokponkumar ashokponkumar merged commit 7e261d2 into main Sep 3, 2025
12 checks passed
@dushyantbehl dushyantbehl deleted the gpt-oss branch November 7, 2025 03:22
4 participants