Don't use RawReducer for activation shape collection in Fast Bias Correction (#3642)
### Changes
Fast Bias Correction no longer uses `RawReducer` to collect activation shapes.
### Reason for changes
This PR reduces the memory footprint of the Fast Bias Correction
algorithm: raw activations do not need to be collected to obtain their
shapes. Avoiding raw reducers saves the memory that would otherwise be
allocated for the activations.
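The idea can be illustrated with a minimal NumPy sketch. It does not use NNCF's actual classes: `RawCollector`, `ShapeCollector`, `run`, and the tensor shape are hypothetical names and values chosen only to show why recording the shape alone avoids keeping activation buffers alive.

```python
import numpy as np


class RawCollector:
    """Hypothetical collector that keeps a copy of every activation."""

    def __init__(self):
        self.samples = []

    def register(self, activation):
        self.samples.append(activation.copy())

    def shape(self):
        # The shape is read back from a stored tensor.
        return tuple(self.samples[0].shape)

    def stored_bytes(self):
        return sum(s.nbytes for s in self.samples)


class ShapeCollector:
    """Hypothetical collector that records only the activation shape."""

    def __init__(self):
        self._shape = None

    def register(self, activation):
        # Only the shape tuple is kept; the tensor itself is not retained.
        if self._shape is None:
            self._shape = tuple(activation.shape)

    def shape(self):
        return self._shape

    def stored_bytes(self):
        # No activation buffers are held by this collector.
        return 0


def run(collector, num_samples=4, shape=(1, 64, 56, 56)):
    """Feed `num_samples` random float32 activations into a collector."""
    rng = np.random.default_rng(0)
    for _ in range(num_samples):
        collector.register(rng.standard_normal(shape, dtype=np.float32))
    return collector


raw = run(RawCollector())
shape_only = run(ShapeCollector())
print(raw.shape(), raw.stored_bytes())
print(shape_only.shape(), shape_only.stored_bytes())
```

Both collectors report the same shape, but the raw variant holds every calibration sample in memory until it is released, so its footprint grows with the number of samples and the activation size.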
Example quantization run on the vision encoder from `OpenGVLab/InternVL2-1B`
with 4 calibration data samples (system memory usage):
| Before | After |
|-|-|
| <img width="1000" height="600" alt="system_memory_usage_from-zero"
src="https://github.com/user-attachments/assets/73354e2f-db21-48a9-8c8a-b5a80426b41a"
/> | <img width="1000" height="600" alt="system_memory_usage_from-zero"
src="https://github.com/user-attachments/assets/a5343cd6-7bef-413a-8688-427b901194b7"
/> |
Because much less memory has to be allocated, statistics collection
time also improves.
### Related tickets
172800
### Tests
Existing tests cover the new changes.
- NNCF/job/manual/job/post_training_quantization/730
- NNCF/job/manual/job/post_training_quantization_performance/119/
---------
Co-authored-by: dlyakhov <[email protected]>