
[quantization] Support convolutions in SensitivityCalibrator #581

Merged
mhs4670go merged 1 commit into Samsung:main from stamalakhov:sens_for_convs
Mar 25, 2026

Conversation

@stamalakhov
Contributor

This PR:

  1. adds support for convolutions with related tests
  2. brings support for multiple inputs
  3. adds the option to hide progress to SensitivityCalibrator.
./ccex test --include-internal -k quantization.algorithm.test_gptq.GPTQTest

RUN unit tests with -k quantization.algorithm.test_gptq.GPTQTest ...
test_groupwise_conv1d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_groupwise_conv2d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_model (quantization.algorithm.test_gptq.GPTQTest) ... <frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute
<frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
ok
test_net (quantization.algorithm.test_gptq.GPTQTest) ... No specialized wrapper found for ModuleList; applying recursive wrapping.
ok
test_net_on_zero_inputs (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv1d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv1d_with_logits (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv2d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv2d_on_zero_inputs (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv2d_with_logits (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv3d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv3d_on_zero_inputs (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_normconv3d_with_logits (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_paddednormconv2d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_paddednormconv3d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_transposed_conv2d (quantization.algorithm.test_gptq.GPTQTest) ... ok
test_transposed_conv2d_with_logits (quantization.algorithm.test_gptq.GPTQTest) ... ok

----------------------------------------------------------------------
Ran 17 tests in 61.068s

OK

Draft: #559
Related: #548

TICO-DCO-1.0-Signed-off-by: s.malakhov s.malakhov@partner.samsung.com

@stamalakhov stamalakhov self-assigned this Mar 25, 2026
if show_progress is True:
print("Computing calibration set")
for prompt in tqdm.tqdm(dataset, disable=not show_progress):
if isinstance(prompt, torch.Tensor):
Contributor Author

Let's process multiple inputs as well.
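The quoted snippet gates all progress output behind a single show_progress flag. A torch/tqdm-free sketch of the same pattern, with illustrative names that are assumptions rather than code from the PR:

```python
def run_calibration(dataset, show_progress=False):
    """Collect prompts, optionally printing progress (pure-Python sketch)."""
    if show_progress:
        print("Computing calibration set")
    collected = []
    for i, prompt in enumerate(dataset, start=1):
        if show_progress:
            # stand-in for tqdm's progress bar
            print(f"\rprompt {i}/{len(dataset)}", end="")
        collected.append(prompt)
    if show_progress:
        print()
    return collected
```

With show_progress=False (the new option), the loop runs silently, which is what `disable=not show_progress` achieves for tqdm in the real code.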

Comment on lines +136 to +141
self.calibrated_types = [
    torch.nn.Linear,
    torch.nn.Conv2d,
    torch.nn.Conv1d,
    torch.nn.Conv3d,
    torch.nn.ConvTranspose2d,
Contributor Author

Calibrate convolutions as well.
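The selection itself is plain isinstance filtering against this type list. A self-contained sketch with stand-in classes (FakeLinear, FakeConv2d, and FakeReLU are assumptions standing in for torch.nn layers, not the real API):

```python
# Stand-ins for torch.nn layer classes, for illustration only.
class FakeLinear: pass
class FakeConv2d: pass
class FakeReLU: pass

# Mirrors the role of self.calibrated_types in the quoted diff.
CALIBRATED_TYPES = (FakeLinear, FakeConv2d)

def modules_to_calibrate(named_modules):
    # isinstance accepts a tuple of types, so a single check suffices
    return {name: mod for name, mod in named_modules.items()
            if isinstance(mod, CALIBRATED_TYPES)}
```

Extending support to new layer kinds then amounts to appending their classes to the calibrated-types list, which is exactly what this PR does for the Conv and ConvTranspose variants.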

inp_ids = inputs.view(-1, inputs.shape[-1])
logits = model(inp_ids.to(model.device)).logits
if isinstance(inputs, torch.Tensor):
    inp_ids = inputs.squeeze(0)  # remove redundant batch dimension
Contributor Author

Remove the redundant batch dimension instead of reshaping the input to a 2D shape.
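The difference matters for convolutions: view(-1, shape[-1]) always forces rank 2, while squeeze(0) only drops a size-1 leading dimension and preserves the rank that Conv1d/2d/3d inputs require. A small shape-level illustration (the helper names are made up for this sketch):

```python
def flattened_2d_shape(shape, last_dim):
    # resulting shape of inputs.view(-1, inputs.shape[-1]): always rank 2
    total = 1
    for d in shape:
        total *= d
    return (total // last_dim, last_dim)

def squeezed_shape(shape):
    # resulting shape of inputs.squeeze(0): drops a leading size-1 batch
    # dimension and otherwise leaves the shape (and rank) unchanged
    return tuple(shape[1:]) if shape and shape[0] == 1 else tuple(shape)
```

For a batched image tensor of shape (1, 3, 8, 8), the view path would yield (24, 8), which a Conv2d cannot consume, while the squeeze path yields (3, 8, 8).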

Comment on lines +173 to +176
for item in inputs:
    inputs[item] = inputs[item].to(model.device).squeeze(0)

logits = model(**inputs).logits
Contributor Author

The same as above, but for multiple inputs.
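For dict-style inputs the same squeeze is applied per key. A sketch operating on shape tuples instead of tensors (the device transfer from the real code is omitted, and the function name is illustrative):

```python
def squeeze_batch_dims(inputs):
    """Drop a leading size-1 batch dimension from every entry of a dict
    of shape tuples, mirroring the per-key squeeze in the quoted loop."""
    return {key: (val[1:] if val and val[0] == 1 else val)
            for key, val in inputs.items()}
```

This keeps keyword-argument models (model(**inputs)) working with the same single-sample calibration convention as the tensor path.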

This PR:
1. adds support for convolutions with related tests
2. brings support for multiple inputs
3. adds the option to hide progress to SensitivityCalibrator.

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>
@stamalakhov stamalakhov requested a review from a team March 25, 2026 07:29
for type in self.calibrated_types:
    if isinstance(module, type):
        modules_to_process[name] = module
        name_of_module[module] = name
Contributor

This is not related to this PR, but name_of_module is not used anywhere.

Contributor Author

Ahh. You're right! Thank you! I'll remove it.

Contributor

@mhs4670go mhs4670go left a comment

LGTM

@mhs4670go mhs4670go merged commit 976b8a9 into Samsung:main Mar 25, 2026
7 checks passed
@stamalakhov stamalakhov deleted the sens_for_convs branch March 25, 2026 09:41