feat: int8 granite addon #92

Merged
chichun-charlie-liu merged 2 commits into foundation-model-stack:main from andrea-fasoli:int8_granite_addon
Apr 9, 2025

@andrea-fasoli
Collaborator

Description of the change

  1. Modified the i8i8 add-ons to support the Granite architecture in FMS by registering the adapter steps and the adapter for Granite. Also unified the registration process across models.
  2. Fixed a bug in the string representation of the Linear module when bias=False.

Was the PR tested

  • I have ensured all unit tests pass

Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>
Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>
Collaborator

Is it required for AIU to turn a "no bias" Linear layer into a Linear with a bias (whose value is torch.zeros(1))?

Collaborator Author

@andrea-fasoli, Apr 9, 2025

It's required by the AIU in the sense that the custom op always expects a tensor as bias (neither None nor a bool).

We could change the W8A8Linear forward such that, if self.bias does not exist, we create a zero tensor on the fly, cast it to the right device, and pass it to the op. I thought that would add more overhead than just instantiating it once.
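
A minimal sketch of the "instantiate it once" option, assuming PyTorch. `PlaceholderBiasLinear` is a hypothetical stand-in, not the actual W8A8Linear: a plain matmul takes the place of the custom op, and the `has_bias` flag is just one possible way to keep the printed representation accurate when bias=False.

```python
import torch
import torch.nn as nn


class PlaceholderBiasLinear(nn.Module):
    """Hypothetical layer whose op always needs a bias tensor."""

    def __init__(self, in_features: int, out_features: int, bias: bool = True):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        # Remember what the user asked for, so printing stays truthful.
        self.has_bias = bias
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        if bias:
            self.bias = nn.Parameter(torch.zeros(out_features))
        else:
            # Allocate the zero placeholder once. As a buffer it moves with
            # the module on .to(device), so forward needs no extra cast.
            self.register_buffer("bias", torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stand-in for the custom op: bias is always a tensor here.
        return torch.matmul(x, self.weight.t()) + self.bias

    def extra_repr(self) -> str:
        return (
            f"in_features={self.in_features}, "
            f"out_features={self.out_features}, bias={self.has_bias}"
        )


layer = PlaceholderBiasLinear(4, 3, bias=False)
print(layer)
print(layer(torch.ones(2, 4)).shape)
```

The on-the-fly alternative would build `torch.zeros(1, device=x.device)` inside `forward` on every call; that avoids storing the placeholder but repeats the allocation each time the layer runs.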

Collaborator

Just want to make sure it's a requirement from the downstream stack. 8)

@chichun-charlie-liu merged commit 50b6ce3 into foundation-model-stack:main on Apr 9, 2025
11 checks passed
@andrea-fasoli deleted the int8_granite_addon branch on April 9, 2025 17:58
@chichun-charlie-liu linked an issue on May 8, 2025 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

INT8 AIU support