
Fix (graph/qronos): Normalize contribution to H and G when buffer is disabled #1440

Merged
Giuseppe5 merged 3 commits into Xilinx:dev from JP-Amboage:fix_qronos_norm on Feb 3, 2026

Conversation

@JP-Amboage (Collaborator)

Reason for this PR

In the method Qronos().update_batch() (in src/brevitas/graph/qronos.py), when self._use_intermediate_buffer is False, the contribution from the current batch of samples added to the matrices H and G is not normalized. The current ("pre-update") value of G or H, however, is rescaled: it is multiplied by the number of samples seen so far and divided by that count plus the number of incoming samples.

This is a bug: it causes samples from later updates to weigh more in the final H and G matrices than samples from earlier updates. Sample order should be irrelevant, and all samples should be weighted equally, as is done when self._use_intermediate_buffer is True, and as in GPTQ (only for H there, since G is not needed).
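To illustrate the imbalance, here is a minimal standalone sketch (simplified shapes and names, not the Brevitas code) of the running-average update for an H-like matrix:

import torch

torch.manual_seed(0)
batches = [torch.randn(4, 3) for _ in range(3)]  # three batches of 4 samples each

# Correct running average: H = (1/n) * sum over all samples of x x^T
H_ok, n = torch.zeros(3, 3), 0
for X in batches:
    b = X.shape[0]
    H_ok *= n / (n + b)          # rescale the previously accumulated contribution
    n += b
    H_ok += X.t().mm(X) / n      # incoming contribution, normalized

# Buggy variant: the old value is rescaled, but the incoming term is not normalized
H_bad, n = torch.zeros(3, 3), 0
for X in batches:
    b = X.shape[0]
    H_bad *= n / (n + b)
    n += b
    H_bad += X.t().mm(X)         # missing 1/n: later batches are over-weighted

H_ref = sum(X.t().mm(X) for X in batches) / n  # ground truth: equal weighting
print(torch.allclose(H_ok, H_ref))   # True
print(torch.allclose(H_bad, H_ref))  # False

In the buggy variant the last batch enters with weight 1 while earlier batches have been repeatedly shrunk by n / (n + b), so later samples dominate the average.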

Changes Made in this PR

In src/brevitas/graph/qronos.py, the line self.G += inp_processed.bmm(self.quant_input.transpose(2, 1)) inside Qronos().update_batch() has been updated so that the incoming contribution is normalized. The new line is

self.G += (inp_processed * (1 / math.sqrt(self.nsamples))).bmm(
    self.quant_input.transpose(2, 1) * (1 / math.sqrt(self.nsamples)))

An analogous change was made to normalize the contribution to self.H.
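For reference, placing a 1 / sqrt(self.nsamples) factor on each operand is equivalent to dividing the product by self.nsamples, which matches the n / (n + b) rescaling applied to the pre-update value. A quick standalone check (hypothetical shapes, not the actual Brevitas tensors):

import math
import torch

n = 8
inp_processed = torch.randn(2, 5, 3)  # stand-in for the real activations
quant_input = torch.randn(2, 5, 3)    # stand-in for the quantized input

scaled = (inp_processed * (1 / math.sqrt(n))).bmm(
    quant_input.transpose(2, 1) * (1 / math.sqrt(n)))
direct = inp_processed.bmm(quant_input.transpose(2, 1)) / n
print(torch.allclose(scaled, direct))  # True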

Testing Summary

  • Tests run locally.
  • Ran the code before and after the fix to quantize a Llama 1B model. When quantizing to int4, perplexity was unaffected by the bug/fix. When quantizing to int2, perplexity dropped from 1032 before the fix to 804 after.

@JP-Amboage JP-Amboage requested a review from i-colbert January 8, 2026 16:56
@JP-Amboage JP-Amboage self-assigned this Jan 12, 2026
@JP-Amboage JP-Amboage requested a review from pablomlago January 12, 2026 11:36
@pablomlago (Collaborator) left a comment

Is it possible to avoid having the normalization scattered in multiple places?
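One possible shape for that consolidation (a hypothetical sketch, not the change adopted in this PR; it assumes both accumulators are updated at the same point in update_batch()):

# Compute the normalization once per batch and pre-scale the operands,
# so every accumulator update reuses the same already-normalized tensors.
scale = 1 / math.sqrt(self.nsamples)
inp_scaled = inp_processed * scale
quant_scaled = self.quant_input * scale
self.G += inp_scaled.bmm(quant_scaled.transpose(2, 1))
# ...the self.H update would reuse the same pre-scaled tensors...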

@JP-Amboage JP-Amboage requested a review from pablomlago January 20, 2026 16:26
@pablomlago (Collaborator) left a comment

LGTM

@Giuseppe5 Giuseppe5 merged commit 493445c into Xilinx:dev Feb 3, 2026
505 of 506 checks passed
