Describe the bug
The following part of the code is causing the issue: here
The issue occurs because the amax set during the calibration step does not take block_sizes into consideration here.
When we then try to compress it, the previously calculated amax is passed as scales here.
This results in the following error:
```
/usr/local/lib/python3.11/dist-packages/modelopt/torch/quantization/qtensor/fp8_tensor.py in quantize(cls, input, scales, axis, block_sizes)
     99         expanded_scales = expanded_scales.reshape(expected_shape)
    100
--> 101         assert scales.shape == tuple(expected_shape), (
    102             f"Mismatch in expected scale shape: {scales.shape} vs {tuple(expected_shape)}"
    103         )

AssertionError: Mismatch in expected scale shape: torch.Size([]) vs (1152, 18)
```
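For reference, a minimal standalone sketch (not ModelOpt code) of the shape mismatch, assuming a weight of shape (1152, 2304) and `block_sizes = {-1: 128}`, which would yield the expected per-block scale shape (1152, 18) from the error above:

```python
# Sketch only: illustrates why a per-tensor amax fails the per-block scale check.
# Assumed shapes: weight (1152, 2304), block size 128 along the last dim.
import torch

weight = torch.randn(1152, 2304)
block_size = 128
expected_shape = (weight.shape[0], weight.shape[1] // block_size)  # (1152, 18)

# Per-tensor amax, as produced by a calibration step that ignores block_sizes:
per_tensor_amax = weight.abs().amax()  # shape: torch.Size([])

# Per-block amax that block-wise FP8 quantization actually expects:
per_block_amax = (
    weight.reshape(weight.shape[0], -1, block_size).abs().amax(dim=-1)
)  # shape: torch.Size([1152, 18])

assert per_block_amax.shape == expected_shape
# Passing the scalar amax as `scales` reproduces the reported mismatch:
assert per_tensor_amax.shape != expected_shape  # -> AssertionError in fp8_tensor.py
```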
System information
nvidia_modelopt - 0.29.0