Feat (equalize): adding initial support for MixQuant #1448
Open
i-colbert wants to merge 27 commits into Xilinx:dev from
Giuseppe5
reviewed
Feb 5, 2026
Giuseppe5
reviewed
Feb 5, 2026
pablomlago
reviewed
Feb 6, 2026
pablomlago
reviewed
Feb 6, 2026
pablomlago
reviewed
Feb 6, 2026
pablomlago
reviewed
Feb 6, 2026
pablomlago
reviewed
Feb 6, 2026
pablomlago
reviewed
Feb 6, 2026
pablomlago
reviewed
Feb 6, 2026
This was referenced Feb 9, 2026
97d5a01 to
bbe6651
Compare
Giuseppe5
requested changes
Feb 12, 2026
Giuseppe5
reviewed
Feb 16, 2026
Giuseppe5
reviewed
Feb 16, 2026
Giuseppe5
reviewed
Feb 16, 2026
src/brevitas/graph/permute.py
Outdated
permute_fn: str = 'massdiff',
disable_for_fused_rotations: bool = False):

assert isinstance(rotation, GraphRotationEqualization), "Error: expected GraphPermutationEqualization instance"
Collaborator
This can also be None, right?
Collaborator
Author
This context manager assumes rotations will be applied. If it is None, then it should throw an error. Will add that now.
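A minimal sketch of the check discussed in this thread, assuming the context manager should fail loudly when no rotation is supplied. `rotate_permute_mode_sketch` and the empty `GraphRotationEqualization` stand-in are hypothetical and only illustrate the validation step, not the actual Brevitas code.

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class GraphRotationEqualization:
    """Empty stand-in for the Brevitas class of the same name."""


@contextmanager
def rotate_permute_mode_sketch(
        rotation: Optional[GraphRotationEqualization]) -> Iterator[GraphRotationEqualization]:
    # Reject None explicitly so the caller gets a clear message
    # instead of a generic assertion failure.
    if rotation is None:
        raise ValueError("rotate_permute_mode requires a rotation, got None")
    if not isinstance(rotation, GraphRotationEqualization):
        raise TypeError(
            f"expected GraphRotationEqualization instance, got {type(rotation).__name__}")
    # The real context manager would set up and tear down permutations here.
    yield rotation


try:
    with rotate_permute_mode_sketch(None):
        pass
except ValueError as err:
    print(err)
```

Raising an exception rather than using `assert` keeps the check active when Python runs with optimizations enabled.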
Giuseppe5
reviewed
Feb 16, 2026
# Add all sinks from the region to the state
for sink_name, sink_wrapper in region.sinks.items():
    module = region.get_module_from_name(sink_name)
    node = find_node_for_module(graph_model, module)
Collaborator
I am not a fan of this dance between nodes and modules. Probably we should just extend the region (including find_srcs and find_sinks) to simply keep track of the node as well, so that you can also do something like node = region.get_node_from_name(...).
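One way to picture this suggestion, as a rough sketch only: the region stores the graph node next to the module when a sink is registered, so no later module-to-node search is needed. `RegionSketch` and `add_sink` are hypothetical names; only `get_module_from_name` appears in the diff, and `get_node_from_name` is the accessor proposed above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class RegionSketch:
    """Hypothetical region that records the graph node alongside each module."""
    name_to_module: Dict[str, Any] = field(default_factory=dict)
    name_to_node: Dict[str, Any] = field(default_factory=dict)

    def add_sink(self, name: str, module: Any, node: Any) -> None:
        # Store both objects at discovery time, when the node is already known.
        self.name_to_module[name] = module
        self.name_to_node[name] = node

    def get_module_from_name(self, name: str) -> Any:
        return self.name_to_module[name]

    def get_node_from_name(self, name: str) -> Any:
        # Direct lookup replaces a graph search keyed on the module.
        return self.name_to_node[name]


region = RegionSketch()
region.add_sink("layers.0.q_proj", module="<module object>", node="<fx node>")
print(region.get_node_from_name("layers.0.q_proj"))
```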
Giuseppe5
reviewed
Feb 16, 2026
@@ -0,0 +1,22 @@
# MixQuant: Pushing the Limits of Block Rotations in Post-Training Quantization

📄 [Paper](https://arxiv.org/pdf/2601.22347)
Collaborator
Before we merge, I'd like to make sure the results are consistent.
Collaborator
Author
I am saving that for the end.
i-colbert
commented
Feb 16, 2026
Reason for this PR
This PR extends graph equalization to enable MixQuant, which calibrates permutations to improve quantization accuracy when using block rotations. If calibrated intentionally (e.g., with mass diffusion), permutations can help balance the distribution of activation magnitudes across blocks prior to rotation. This is particularly beneficial for low-bit quantization (e.g., INT4, FP4) where outlier management is critical.
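To illustrate how a permutation can balance activation magnitudes across blocks before rotation, here is a small pure-Python sketch. It assumes one plausible reading of a "zigzag" assignment: sort channels by magnitude and deal them across blocks in a serpentine order. The function names and the logic are illustrative assumptions, not the implementation in `src/brevitas/graph/permute.py`.

```python
from typing import List


def zigzag_permutation(magnitudes: List[float], block_size: int) -> List[int]:
    """Deal magnitude-sorted channels across blocks in a serpentine pattern."""
    n = len(magnitudes)
    assert n % block_size == 0, "channel count must be divisible by block size"
    num_blocks = n // block_size
    # Channel indices sorted from largest to smallest magnitude.
    order = sorted(range(n), key=lambda i: magnitudes[i], reverse=True)
    blocks: List[List[int]] = [[] for _ in range(num_blocks)]
    for rank, channel in enumerate(order):
        sweep, pos = divmod(rank, num_blocks)
        # Even sweeps go left to right, odd sweeps go right to left.
        block = pos if sweep % 2 == 0 else num_blocks - 1 - pos
        blocks[block].append(channel)
    # Concatenate the blocks to obtain the flat channel order.
    return [c for b in blocks for c in b]


def block_mass(magnitudes: List[float], perm: List[int], block_size: int) -> List[float]:
    """Sum of magnitudes inside each contiguous block after applying perm."""
    permuted = [magnitudes[i] for i in perm]
    return [sum(permuted[s:s + block_size]) for s in range(0, len(permuted), block_size)]


mags = [9.0, 8.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0]  # two outliers in the first block
identity = list(range(len(mags)))
perm = zigzag_permutation(mags, block_size=4)
print(block_mass(mags, identity, 4))  # [19.0, 5.0]
print(block_mass(mags, perm, 4))      # [12.0, 12.0]
```

In this toy case both outliers start in the same block; after the permutation each block holds one outlier, so a per-block rotation has less mass to spread in any single block.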
Changes Made in this PR
New Permutation Infrastructure (`src/brevitas/graph/permute.py`):
- `GraphPermutationEqualization` class to manage permutation computation and application
- Permutation strategies: `massdiff`, `zigzag`, `absmax`, and `random`
- `rotate_permute_mode` context manager for unified rotation+permutation workflow

CLI Integration (`src/brevitas_examples/llm/llm_args.py`):
- `--apply-permute` flag to enable permutation equalization
- `--permute-fn` argument to select permutation strategy (default: `massdiff`)

Main Workflow Updates (`src/brevitas_examples/llm/main.py`):
- `fused_rotation_no_fx()` to support permutation mode

Utility Functions (`src/brevitas/graph/utils.py`):
- `find_node_for_module` helper function for graph traversal

Context Manager Design: The `rotate_permute_mode` context manager encapsulates the entire workflow:

Notable Implementation Details:
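As a rough illustration of the CLI integration listed above, the two flags could be declared with `argparse` along these lines. The flag names and the `massdiff` default come from this description; the `store_true` action, the `choices` restriction, and the help strings are assumptions, and the actual definitions in `llm_args.py` may differ.

```python
import argparse

# Strategy names taken from the PR description.
PERMUTE_FNS = ("massdiff", "zigzag", "absmax", "random")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical stand-in for the LLM entry point")
    parser.add_argument(
        "--apply-permute",
        action="store_true",  # assumed: a boolean switch, off by default
        help="Enable permutation equalization.")
    parser.add_argument(
        "--permute-fn",
        type=str,
        default="massdiff",
        choices=PERMUTE_FNS,  # assumed: restrict to the four listed strategies
        help="Permutation strategy to use (default: massdiff).")
    return parser


args = build_parser().parse_args(["--apply-permute", "--permute-fn", "zigzag"])
print(args.apply_permute, args.permute_fn)
```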
Expected Results
MixQuant demonstrates significant improvements over block rotations alone on Llama-3.2-1B-Instruct with W4A4 per-channel quantization. Using block rotations with `block_rotation_dim: 32` and the `massdiff` permutation strategy, MixQuant achieves:

The improvements stem from better activation outlier management through channel permutations that balance magnitude distributions within rotation blocks.

Configuration: Both methods use Qronos for error correction with dynamic per-row activations, MSE weight scales, and fused Hadamard rotations. See `src/brevitas_examples/papers/mixquant/llama3-mixquant-int4.yml` for the full config. You can run this as:

Please use https://github.com/i-colbert/brevitas/tree/mixquant/src/brevitas_examples/papers/mixquant to reproduce the experiments from the paper.
Testing Summary
Added `test_rotate_permute_mode` to `tests/brevitas/graph/test_equalization.py`
- `massdiff`, `zigzag`, `absmax`, and `random`
- `block_rotation_dim`, `disable_block_rotation_for_fused`, and `expansion_step`

Risk Highlight
Checklist
`dev` branch.