@nikita-savelyevv
Collaborator

@nikita-savelyevv nikita-savelyevv commented Aug 28, 2025

Changes

As in the title.

Reason for changes

This PR reduces the memory footprint of the Fast Bias Correction algorithm. Collecting raw activations is not required to obtain their shapes, so avoiding raw reducers saves the memory that would otherwise be allocated for the activations.
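A minimal sketch of the idea, not the actual NNCF implementation: `Activation`, `RawCollector`, and `ShapeCollector` are hypothetical names made up for illustration, and plain `array` buffers stand in for real framework tensors. It only shows why keeping the shape instead of the tensor itself avoids holding activation data in memory.

```python
from array import array
from dataclasses import dataclass
from math import prod


@dataclass
class Activation:
    """Hypothetical stand-in for a framework tensor: a shape plus flat float data."""
    shape: tuple
    data: array


def make_activation(shape):
    # Allocate a zero-filled buffer with one element per tensor entry.
    return Activation(shape, array("f", [0.0]) * prod(shape))


class RawCollector:
    """Keeps every activation, so memory grows with samples * tensor size."""

    def __init__(self):
        self.samples = []

    def register(self, act):
        self.samples.append(act)

    def shape(self):
        return self.samples[0].shape

    def retained_bytes(self):
        return sum(a.data.itemsize * len(a.data) for a in self.samples)


class ShapeCollector:
    """Keeps only the shape tuple, so the activation data can be freed right away."""

    def __init__(self):
        self._shape = None

    def register(self, act):
        self._shape = act.shape

    def shape(self):
        return self._shape

    def retained_bytes(self):
        # No activation buffers are referenced by this collector.
        return 0


raw, shape_only = RawCollector(), ShapeCollector()
for _ in range(4):  # 4 calibration samples (illustrative tensor size)
    act = make_activation((1, 8, 16))
    raw.register(act)
    shape_only.register(act)

# Both collectors report the same shape, but only one retains the data.
print(raw.shape(), raw.retained_bytes())
print(shape_only.shape(), shape_only.retained_bytes())
```

Both collectors answer the shape query identically; the difference is only in what stays referenced between calibration steps.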

Example quantization run on the vision encoder from OpenGVLab/InternVL2-1B with 4 calibration data samples:

Before: [image: system_memory_usage_from-zero]
After: [image: system_memory_usage_from-zero]

Since there is no need to allocate so much memory, statistics collection time also improves.

Related tickets

172800

Tests

Existing tests cover the new changes.

  • NNCF/job/manual/job/post_training_quantization/730
  • NNCF/job/manual/job/post_training_quantization_performance/119/

@github-actions github-actions bot added the NNCF Common label Aug 28, 2025
@nikita-savelyevv nikita-savelyevv marked this pull request as ready for review August 28, 2025 15:28
@nikita-savelyevv nikita-savelyevv requested a review from a team as a code owner August 28, 2025 15:28
@andrey-churkin andrey-churkin self-requested a review October 10, 2025 11:04
@andrey-churkin andrey-churkin merged commit 24f9c81 into openvinotoolkit:develop Oct 10, 2025
20 checks passed
@andrey-churkin
Contributor

WC passed: https://github.com/openvinotoolkit/nncf/actions/runs/18402892646
@AlexanderDokuchaev If you have any concerns, let us know. They will be addressed in a follow-up PR.

Labels

Code Freeze, NNCF Common

3 participants