[quantization] Introduce wrapper for Qwen3VLTextDecoderLayer #566

Merged
mhs4670go merged 1 commit into Samsung:main from dvsav:quant_text_decoder_layer
Mar 24, 2026

dvsav commented Mar 19, 2026

This change introduces the QuantQwen3VLTextDecoderLayer wrapper to support post-training quantization (PTQ) of the Qwen3VLTextDecoderLayer module.

Why?

Qwen3VLTextDecoderLayer is an essential part of the Qwen model.
Trying to quantize Qwen3VLTextDecoderLayer via PTQ raises an exception: PTQQuantizer: no quantization wrapper for Qwen3VLTextDecoderLayer.
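
To illustrate how such an error can arise, here is a minimal, self-contained sketch of a type-keyed wrapper registry. Every name in it (`_WRAPPERS`, `register`, `lookup`, the stand-in classes) and the exception type are assumptions made for illustration, not TICO's actual API; only the message text is taken from the description above.

```python
# Hypothetical sketch of a type-keyed wrapper registry; not TICO's real code.
from typing import Callable, Dict

# Maps a floating-point module class to its quantization wrapper class.
_WRAPPERS: Dict[type, type] = {}


def register(fp_cls: type) -> Callable[[type], type]:
    """Class decorator that records `wrapper_cls` as the wrapper for `fp_cls`."""

    def decorator(wrapper_cls: type) -> type:
        _WRAPPERS[fp_cls] = wrapper_cls
        return wrapper_cls

    return decorator


def lookup(fp_cls: type) -> type:
    """Return the registered wrapper, or fail with a message like the one above."""
    try:
        return _WRAPPERS[fp_cls]
    except KeyError:
        # The exception type is an assumption; only the message is from the PR.
        raise NotImplementedError(
            f"PTQQuantizer: no quantization wrapper for {fp_cls.__name__}"
        ) from None


class Qwen3VLTextDecoderLayer:  # stand-in for the transformers class
    pass


# Before a wrapper is registered, the lookup fails.
try:
    lookup(Qwen3VLTextDecoderLayer)
except NotImplementedError as err:
    error_message = str(err)


@register(Qwen3VLTextDecoderLayer)
class QuantQwen3VLTextDecoderLayer:  # stand-in for the new wrapper
    pass


# After registration, the same lookup resolves to the wrapper class.
resolved = lookup(Qwen3VLTextDecoderLayer)
```

In this sketch, adding a wrapper class and a registry entry is enough to make the lookup succeed, which mirrors the shape of the change listed under "What".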

What?

This change introduces:

  • Class QuantQwen3VLTextDecoderLayer (tico/quantization/wrapq/wrappers/qwen_vl/quant_text_decoder_layer.py).
  • Unit tests: class TestQuantQwen3VLTextDecoderLayer (test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py), skipped if the transformers package is not installed.
  • New entry in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
  • Example of Qwen3VLTextDecoderLayer quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_text_decoder_layer.py).
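
The "skipped if the transformers package is not installed" behaviour can be sketched with the standard library alone. This is one common pattern, not necessarily the one used in the PR's test file; the class name `TestOptionalDependency` and the flag `HAS_TRANSFORMERS` are hypothetical.

```python
import importlib.util
import unittest

# True only when the optional dependency can be located on the import path.
HAS_TRANSFORMERS = importlib.util.find_spec("transformers") is not None


@unittest.skipUnless(HAS_TRANSFORMERS, "transformers package is not installed")
class TestOptionalDependency(unittest.TestCase):  # hypothetical test class
    def test_dependency_available(self):
        # Runs only when transformers is present; otherwise the class is skipped.
        self.assertTrue(HAS_TRANSFORMERS)
```

A skipped class is reported as skipped rather than failed, so the test run stays green on machines without the optional package.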

Unit Tests

Unit test results with coverage information:

$ coverage run -m pytest test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py -v
================================================================== test session starts ===================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python3
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 8 items

test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_different_batch_sizes            PASSED [ 12%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_forward_diff                     PASSED [ 25%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_mode_transitions                 PASSED [ 37%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_observer_count                   PASSED [ 50%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_output_shape                     PASSED [ 62%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_per_module_override              PASSED [ 75%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_registration_in_registry         PASSED [ 87%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_text_decoder_layer.py::TestQuantQwen3VLTextDecoderLayer::test_residual_connection_preservation PASSED [100%]

======================================================== 8 passed, 2 warnings in 80.06s (0:01:20) ========================================================

Coverage info (irrelevant files skipped):

$ coverage report -m

Name                                                                    Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------------------------------------
...
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_decoder_layer.py       42      0   100%
...
-----------------------------------------------------------------------------------------------------
TOTAL                                                                   11272   7196    36%

Script for testing quantization and conversion to Circle

$ python tico/quantization/wrapq/examples/qwen/quantize_text_decoder_layer.py
┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.018236
│ PEIR       : 1.069503 %
└──────────────────────────────────────────────────────
    ┌────────────────────────────────────────────┐
 4.5┤                                            │
    │                                         •  │
    │                                      •••   │
 3.0┤                                    •••     │
    │                                  •••       │
    │                                •••         │
    │                             ••••           │
 1.5┤                           ••••             │
    │                         ••••               │
    │                       ••••                 │
 0.0┤                     ••••                   │
    │                   ••••                     │
    │                 ••••                       │
    │               ••••                         │
-1.5┤             ••••                           │
    │           ••••                             │
    │         •••                                │
-3.0┤       •••                                  │
    │     •••                                    │
    │   •••                                      │
    │  •                                         │
-4.5┤                                            │
    └┬──────────┬──────────┬─────────┬──────────┬┘
   -4.5       -2.3        0.0       2.3       4.5 

Circle model saved as 'qwen3vl_text_decoder_layer.q.circle'

dvsav force-pushed the quant_text_decoder_layer branch from 613c233 to be903b6 on March 19, 2026 13:10
dvsav force-pushed the quant_text_decoder_layer branch from be903b6 to cae5c77 on March 19, 2026 13:22
dvsav marked this pull request as ready for review on March 19, 2026 13:32
dvsav force-pushed the quant_text_decoder_layer branch 2 times, most recently from 9795d38 to be93970 on March 23, 2026 06:53
mhs4670go commented

@dvsav Seems CI test failed.

This change introduces QuantQwen3VLTextDecoderLayer wrapper to support post-training quantization of Qwen3VLTextDecoderLayer operation.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>

dvsav commented Mar 24, 2026

> @dvsav Seems CI test failed.

Hi @mhs4670go,
Yes, indeed. I've fixed it now.

# Convert to quantized version
quantized_model = tico.quantization.convert(prepared_model, inplace=True)

# Compute PEIR (Peak Error-to-Input Ratio) between quantized model and original model

PEIR stands for Peak Error-to-Interval Ratio, not Peak Error-to-Input Ratio. Other code seems to have the same mistake. Let's fix them all at once in another PR.
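
For reference, a minimal sketch of the metric as the expanded name suggests: the peak absolute error divided by the value interval (max minus min) of the reference output. This definition and the function name `compute_peir` are assumptions for illustration; TICO's own helper may differ in name, signature, and edge-case handling.

```python
from typing import Sequence


def compute_peir(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Peak absolute error divided by the reference value interval (max - min)."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    peak_error = max(abs(r - c) for r, c in zip(reference, candidate))
    interval = max(reference) - min(reference)
    if interval == 0.0:
        # Degenerate case: constant reference, so avoid dividing by zero.
        return 0.0 if peak_error == 0.0 else float("inf")
    return peak_error / interval


# Illustrative values only: peak error 0.5 over an interval of 8.0.
reference = [-4.0, 0.0, 4.0]
candidate = [-4.0, 0.5, 4.0]
peir = compute_peir(reference, candidate)
```

Under this definition the error is normalized by the spread of the reference values rather than by the input, which is the distinction the comment above draws.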


mhs4670go left a comment

LGTM

mhs4670go merged commit 8a8e7d8 into Samsung:main on Mar 24, 2026
7 checks passed
dvsav deleted the quant_text_decoder_layer branch on March 25, 2026 06:53