[quantization] Fallback for complex models #583

Merged: mhs4670go merged 1 commit into Samsung:main from stamalakhov:vlm_gptq_pr on Mar 26, 2026
Conversation

@stamalakhov
Contributor

This PR:

  1. quantizes the whole model (e.g. a VLM) as a single layer in GPTQ, and
  2. adds an example script for general VLM models.
Log of `python tico/quantization/wrapq/examples/quantize_full_vlm_model_with_gptq.py --model HuggingFaceTB/SmolVLM-256M-Instruct --nsamples_for_qcalibration 20 --eval_tasks vqav2 --gptq_mse mse`:

```
Namespace(model='HuggingFaceTB/SmolVLM-256M-Instruct', device='cuda', dtype='float32', seed=42, trust_remote_code=False, hf_token=None, cache_dir='/mnt/storage/transformers_cache', nsamples_for_qcalibration=20, nsamples_for_evaluation=50, calib_seq_len=2048, max_seq_len=2048, linear_weight_bits=4, gptq_mse='mse', eval_tasks='vqav2', sensitivity_path=None)
=== Config ===
Model            : HuggingFaceTB/SmolVLM-256M-Instruct
Device           : cuda
DType            : float32
Calib seq len    : 2048
Max seq len      : 2048

Loading FP model …
Evaluating original model
vqav2: EM=0.5600  (n=50)
Original EM: 0.5600  (dataset=vqav2, n=50)
Resolving data files: 100%|█████████████████████████████████████████████████████████████| 68/68 [00:00<00:00, 7662.06it/s]
Resolving data files: 100%|█████████████████████████████████████████████████████████████| 36/36 [00:00<00:00, 6735.73it/s]
Resolving data files: 100%|██████████████████████████████████████████████████████████| 143/143 [00:00<00:00, 10294.45it/s]
Resolving data files: 100%|█████████████████████████████████████████████████████████████| 68/68 [00:00<00:00, 6965.07it/s]
Resolving data files: 100%|█████████████████████████████████████████████████████████████| 36/36 [00:00<00:00, 7293.38it/s]
Resolving data files: 100%|██████████████████████████████████████████████████████████| 143/143 [00:00<00:00, 13307.27it/s]
[info] Loaded dataset: lmms-lab/VQAv2 (testdev, streaming), size=20
Applying GPTQ …
Quantizing layers: 100%|████████████████████████████████████████████████████████████████| 1/1 [02:02<00:00, 122.53s/layer]
Evaluating quantized model
vqav2: EM=0.5600  (n=50)
```
| model | vqav2 EM |
|---|---|
| original | 0.5600 |
| quantized | 0.5600 |
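For context, the "whole model as a single layer" fallback in item 1 can be sketched roughly as follows. This is an illustrative round-to-nearest stand-in, not the repository's actual GPTQ implementation (real GPTQ also uses a calibration Hessian and error compensation per layer); the function names and the plain-dict weight representation are hypothetical.

```python
import numpy as np

def fake_quant_4bit(w: np.ndarray, qmax: int = 7) -> np.ndarray:
    """Per-row symmetric 4-bit round-to-nearest fake-quantization.
    An illustrative stand-in for GPTQ's per-layer weight update."""
    scale = np.abs(w).max(axis=-1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero rows
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def quantize_whole_model(weights: dict) -> dict:
    """Fallback idea: treat the entire model as one GPTQ 'layer' and
    run a single quantization pass over every linear weight, instead
    of the usual per-decoder-block traversal that complex module
    graphs (e.g. VLMs mixing vision and language towers) can break."""
    return {name: fake_quant_4bit(w) for name, w in weights.items()}
```

The trade-off is the usual one for a fallback path: a single pass is simpler and works for arbitrary module graphs, at the cost of any block-by-block error compensation.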

Please see other results in #559
Draft: #559
Related: #548

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>

@stamalakhov stamalakhov self-assigned this Mar 25, 2026

@Torrero Torrero left a comment

LGTM

Contributor

@mhs4670go mhs4670go left a comment

LGTM

@mhs4670go mhs4670go merged commit 04b1927 into Samsung:main Mar 26, 2026
7 checks passed
@stamalakhov stamalakhov deleted the vlm_gptq_pr branch March 26, 2026 04:47