This repository was archived by the owner on Sep 10, 2025. It is now read-only.

@gabe-l-hart
Contributor

Description

This PR adds support for the dense Granite models in the 3.0 and 3.1 collections.

Changes

  • Add granite architecture parameters
  • Use granite multipliers where appropriate only if set in config
  • Add model_params for each of the new granite models
  • Add entries in model.json for each of the new granite models
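
A minimal sketch of the "only if set in config" behaviour from the second bullet. It is not the code from this PR: the `GraniteMultipliers` class and the helper functions are hypothetical, the field names are assumed to mirror the Hugging Face Granite config (`embedding_multiplier`, `attention_multiplier`, `residual_multiplier`, `logits_scaling`), and plain floats stand in for tensors. When a multiplier is unset, each helper falls back to the standard behaviour, so non-Granite models would be unaffected.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class GraniteMultipliers:
    """Hypothetical container; field names are assumed, not taken from the PR."""
    embedding_multiplier: Optional[float] = None
    attention_multiplier: Optional[float] = None
    residual_multiplier: Optional[float] = None
    logits_scaling: Optional[float] = None


def scale_embedding(x: float, cfg: GraniteMultipliers) -> float:
    # Token embeddings are multiplied only when the config sets a multiplier.
    if cfg.embedding_multiplier is None:
        return x
    return x * cfg.embedding_multiplier


def attention_scale(head_dim: int, cfg: GraniteMultipliers) -> float:
    # The multiplier replaces the usual 1/sqrt(head_dim) softmax scale.
    if cfg.attention_multiplier is None:
        return 1.0 / math.sqrt(head_dim)
    return cfg.attention_multiplier


def add_residual(residual: float, branch: float, cfg: GraniteMultipliers) -> float:
    # The attention/MLP branch output is scaled before the residual add.
    if cfg.residual_multiplier is not None:
        branch = branch * cfg.residual_multiplier
    return residual + branch


def scale_logits(logits: float, cfg: GraniteMultipliers) -> float:
    # Output logits are divided by the scaling factor when one is set.
    if cfg.logits_scaling is None:
        return logits
    return logits / cfg.logits_scaling


# Illustrative values only; real values come from each model's config.
default_cfg = GraniteMultipliers()
granite_cfg = GraniteMultipliers(
    embedding_multiplier=12.0,
    attention_multiplier=0.015625,
    residual_multiplier=0.22,
    logits_scaling=8.0,
)
```

Using `None` as the "unset" sentinel, rather than a default of `1.0`, keeps the attention case honest: the fallback there is `1/sqrt(head_dim)`, not a no-op multiplication.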

@pytorch-bot

pytorch-bot bot commented Dec 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1432

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 82b2adb with merge base cc0ffce:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 19, 2024
This does not yet implement the usage of the new multipliers in the
architecture, so the output is garbage at the moment.

NOTE: There is currently a bug where this model is missing tokenizer.json
in HF, but that should be resolved soon.

Branch: GraniteThreeDenseSupport

Signed-off-by: Gabe Goodhart <[email protected]>
Branch: GraniteThreeDenseSupport

Signed-off-by: Gabe Goodhart <[email protected]>
@gabe-l-hart force-pushed the GraniteThreeDenseSupport branch from 4438c6f to 82b2adb on December 19, 2024 23:33
@Jack-Khuu
Contributor

Just initial glance so far, but wow this is really clean

@Jack-Khuu self-requested a review December 20, 2024 00:05
@Jack-Khuu merged commit 83314c7 into pytorch:main Dec 20, 2024
51 of 53 checks passed
vmpuri pushed a commit that referenced this pull request Feb 4, 2025
* feat(granite3): Add config plumbing for granite3-2b

This does not yet implement the usage of the new multipliers in the
architecture, so the output is garbage at the moment.

NOTE: There is currently a bug where this model is missing tokenizer.json
in HF, but that should be resolved soon.

Branch: GraniteThreeDenseSupport

Signed-off-by: Gabe Goodhart <[email protected]>

* feat: Use multipliers conditionally in the model architecture

Signed-off-by: Gabe Goodhart <[email protected]>

* feat: Add plumbing for Granite 3.0 8b and 3.1 2b/8b

Branch: GraniteThreeDenseSupport

Signed-off-by: Gabe Goodhart <[email protected]>

---------

Signed-off-by: Gabe Goodhart <[email protected]>
Labels

CLA Signed This label is managed by the Meta Open Source bot.

3 participants