[Transforms] Update examples for R4 and transform_block_size option #1870
brian-dellabetta merged 5 commits into main

Conversation
gemini-code-assist
left a comment
Code Review
This pull request updates the SpinQuant and QuIP examples to use the transform_block_size option and adds support for R4 rotation in the SpinQuant example. The core change is the renaming of the block_size parameter to head_dim in the internal TransformScheme class to maintain compatibility with vllm, as explained in the description. The user-facing API in the modifiers correctly retains the transform_block_size parameter. The changes are consistent, and the examples are updated accordingly. I've identified a couple of areas where validation could be improved to prevent potential runtime errors.
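For readers skimming the conversation, here is a minimal sketch of the user-facing API discussed in this review; the module path and the internal forwarding comment reflect my reading of the PR, not verbatim code from it:

```python
# Sketch: the modifier keeps the user-facing `transform_block_size` argument,
# while the TransformScheme it builds internally stores the value as `head_dim`
# to stay aligned with the name that already landed in vllm.
from llmcompressor.modifiers.transform import QuIPModifier

modifier = QuIPModifier(
    rotations=["v", "u"],
    transform_block_size=64,  # user-facing name; becomes TransformScheme(head_dim=64)
    transform_type="hadamard",
)
```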
fynnsu
left a comment
Looks good! Added a question about a comment below.
… (#1883) SUMMARY: Quick follow-up to the recently merged #1870. Updates our `examples/transform` scripts to:
- [x] default to `transform_type="hadamard"`, which is preferred so that the vllm hadacore kernel is used
- [x] default to `transform_block_size=128`, which is preferred for group-size 128 schemes like W4A16

TEST PLAN: Previously confirmed that the hadacore kernel was being invoked for `transform_type="hadamard"`.

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Rahul Tuli <rtuli@redhat.com>
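A hedged sketch of a recipe using the defaults this follow-up prefers; the quantization scheme and ignore list are illustrative assumptions, and the real scripts live in `examples/transform`:

```python
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.modifiers.transform import QuIPModifier

recipe = [
    # "hadamard" (rather than "random-hadamard") lets vllm dispatch to the hadacore kernel
    QuIPModifier(rotations=["v", "u"], transform_block_size=128, transform_type="hadamard"),
    # transform_block_size=128 lines up with the 128 group size of schemes like W4A16
    QuantizationModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
```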
SUMMARY:
Prerequisites:
- [x] Rename `TransformScheme.block_size` to `TransformScheme.head_dim` for compatibility with vllm (vllm-project/compressed-tensors#472)

This PR updates the SpinQuant and QuIP examples to include `transform_block_size` and the latest R4 feature in SpinQuant. It also reverts the `TransformScheme.block_size` changes previously introduced into CT and updated in the PR linked above. While `block_size` is a more appropriate name, `head_dim` has already landed in vllm, and it would be too much of a pain to change. Users will rarely create their own `TransformScheme` anyway.

TEST PLAN:
- [x] Both examples run and the saved model can be run in vllm; output is meaningful.
- [x] With prints, confirmed hadacore is used for `QuIPModifier(rotations=["v", "u"], transform_block_size=64, transform_type="hadamard")`
- [x] and dense gemm is used for `QuIPModifier(rotations=["v", "u"], transform_block_size=64, transform_type="random-hadamard")`
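For context, a rough end-to-end sketch of how such a configuration is exercised with `oneshot`; the model ID and save path are placeholders, not taken from this PR:

```python
from transformers import AutoModelForCausalLM
from llmcompressor import oneshot
from llmcompressor.modifiers.transform import QuIPModifier

# Placeholder model; any causal LM supported by llm-compressor should work.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct", torch_dtype="auto"
)

# Swap transform_type to "random-hadamard" to exercise the dense gemm path instead.
recipe = QuIPModifier(rotations=["v", "u"], transform_block_size=64, transform_type="hadamard")

# QuIP-style rotations are data-free, so no calibration dataset is passed here.
oneshot(model=model, recipe=recipe)
model.save_pretrained("Llama-3.2-1B-Instruct-quip")  # placeholder output dir
```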