
Fix (graph/qronos): Normalize contribution to H and G when buffer is disabled #1440

Merged
Giuseppe5 merged 3 commits into Xilinx:dev from JP-Amboage:fix_qronos_norm on Feb 3, 2026

Conversation

@JP-Amboage (Collaborator)

Reason for this PR

In the method Qronos().update_batch() (in src/brevitas/graph/qronos.py), when self._use_intermediate_buffer is False, the contribution from the current batch of samples added to the matrices H and G is not normalized. The current ("pre-update") value of G or H, however, is rescaled: it is multiplied by the number of samples seen so far and divided by that count plus the number of incoming samples.

This is a bug: it causes samples from later updates to weigh more in the final H and G matrices than samples from earlier updates. Sample order should be irrelevant, and all samples should be weighted equally, as is done when self._use_intermediate_buffer is True, and as in GPTQ (only for H there, since G is not needed).
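To illustrate the imbalance, here is a minimal standalone sketch (simplified shapes and names, not the Brevitas code) of the running-average update for an H-like matrix:

import torch

torch.manual_seed(0)
batches = [torch.randn(4, 3) for _ in range(3)]  # three batches of 4 samples each

# Correct running average: H = (1/n) * sum over all samples of x x^T
H_ok, n = torch.zeros(3, 3), 0
for X in batches:
    b = X.shape[0]
    H_ok *= n / (n + b)          # rescale the previously accumulated contribution
    n += b
    H_ok += X.t().mm(X) / n      # incoming contribution, normalized

# Buggy variant: the old value is rescaled, but the incoming term is not normalized
H_bad, n = torch.zeros(3, 3), 0
for X in batches:
    b = X.shape[0]
    H_bad *= n / (n + b)
    n += b
    H_bad += X.t().mm(X)         # missing 1/n: later batches are over-weighted

H_ref = sum(X.t().mm(X) for X in batches) / n  # ground truth: equal weighting
print(torch.allclose(H_ok, H_ref))   # True
print(torch.allclose(H_bad, H_ref))  # False

In the buggy variant the last batch enters with weight 1 while earlier batches have been repeatedly shrunk by n / (n + b), so later samples dominate the average.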

Changes Made in this PR

In src/brevitas/graph/qronos.py, the line self.G += inp_processed.bmm(self.quant_input.transpose(2, 1)) inside Qronos().update_batch() has been updated so that the incoming contribution is normalized. The new line is

self.G += (inp_processed * (1 / math.sqrt(self.nsamples))).bmm(
    self.quant_input.transpose(2, 1) * (1 / math.sqrt(self.nsamples)))

An analogous change was made to normalize the contribution to self.H.
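For reference, placing a 1 / sqrt(self.nsamples) factor on each operand is equivalent to dividing the product by self.nsamples, which matches the n / (n + b) rescaling applied to the pre-update value. A quick standalone check (hypothetical shapes, not the actual Brevitas tensors):

import math
import torch

n = 8
inp_processed = torch.randn(2, 5, 3)  # stand-in for the real activations
quant_input = torch.randn(2, 5, 3)    # stand-in for the quantized input

scaled = (inp_processed * (1 / math.sqrt(n))).bmm(
    quant_input.transpose(2, 1) * (1 / math.sqrt(n)))
direct = inp_processed.bmm(quant_input.transpose(2, 1)) / n
print(torch.allclose(scaled, direct))  # True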

Testing Summary

  • Tests run locally.
  • Ran the code before and after the fix to quantize a Llama 1B model. When quantizing to int4, perplexity was unaffected by the bug/fix. When quantizing to int2, perplexity dropped from 1032 before the fix to 804 after.

@JP-Amboage JP-Amboage requested a review from i-colbert January 8, 2026 16:56
@JP-Amboage JP-Amboage self-assigned this Jan 12, 2026
@JP-Amboage JP-Amboage requested a review from pablomlago January 12, 2026 11:36
@pablomlago (Collaborator) left a comment

Is it possible to avoid having the normalization scattered in multiple places?
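One possible shape for that consolidation (a hypothetical sketch, not the change adopted in this PR; it assumes both accumulators are updated at the same point in update_batch()):

# Compute the normalization once per batch and pre-scale the operands,
# so every accumulator update reuses the same already-normalized tensors.
scale = 1 / math.sqrt(self.nsamples)
inp_scaled = inp_processed * scale
quant_scaled = self.quant_input * scale
self.G += inp_scaled.bmm(quant_scaled.transpose(2, 1))
# ...the self.H update would reuse the same pre-scaled tensors...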

@JP-Amboage JP-Amboage requested a review from pablomlago January 20, 2026 16:26
@pablomlago (Collaborator) left a comment

LGTM

@Giuseppe5 Giuseppe5 merged commit 493445c into Xilinx:dev Feb 3, 2026
505 of 506 checks passed
