Make AWQ more general #2400

jerryzh168 · 2025-06-18T04:26:00Z

Summary:

Added AWQConfig that takes a base config and made corresponding changes
in other parts of the flow

Test Plan:
Tested on Phi4-mini and Qwen3-8B

Qwen3-8B

| Task | calibration_limit | no-awq | awq |
|------|-------------------|--------|-----|
| leaderboard_math_hard (v3) | 2 | 0.3543 | 0.4371 |
| gpqa_main_zeroshot | 50 | 0.32 | 0.36 |
| mmlu | 5 | 0.7372 | 0.7463 |
| bbh | 1 | 0.7385 | 0.7556 |

Phi4-mini

| Task | calibration_limit | no-awq | awq |
|------|-------------------|--------|-----|
| mmlu_pro | 2 | 0.4057 | 0.4757 |
| gsm8k | 5 | 0.72 | 0.76 |

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2025-06-18T04:26:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2400

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8b229a7 with merge base ffaf572:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kimishpatel · 2025-06-18T19:36:17Z

torchao/prototype/awq/api.py

+                eps=eps,
+            )
+        else:
+            observer = AWQObserver2(


can you not add kwargs to the AWQObserver and just check 'base_config' in kwargs?

yes, this is temporary, I think we can deprecate the old one in the end

kimishpatel · 2025-06-18T19:36:58Z

torchao/prototype/awq/api.py

+
+
+@dataclass
+class AWQConfig(AOBaseConfig):


OK, so this consolidates AWQ with the quantize_ API's config-based design?
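For readers unfamiliar with the pattern being asked about, here is a minimal stdlib-only sketch of a config that wraps a base config and dispatches to a registered handler. Every name in it (`BaseConfig`, `ToyInt4Config`, `ToyAWQConfig`, `register_handler`, `apply_config`) is hypothetical and chosen for illustration; it is not the torchao implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Type


class BaseConfig:
    """Hypothetical stand-in for a config base class such as AOBaseConfig."""


@dataclass
class ToyInt4Config(BaseConfig):
    group_size: int = 32


@dataclass
class ToyAWQConfig(BaseConfig):
    # The wrapper carries the config that describes the underlying quantization.
    base_config: BaseConfig


# Registry mapping a config type to the function that applies it.
_HANDLERS: Dict[Type[BaseConfig], Callable[[List[float], BaseConfig], str]] = {}


def register_handler(config_type: Type[BaseConfig]):
    def decorator(fn):
        _HANDLERS[config_type] = fn
        return fn
    return decorator


@register_handler(ToyInt4Config)
def _apply_int4(weights, config):
    return f"int4(group_size={config.group_size}) applied to {len(weights)} weights"


@register_handler(ToyAWQConfig)
def _apply_awq(weights, config):
    # Delegate the actual quantization to the handler registered for the base config.
    base_handler = _HANDLERS[type(config.base_config)]
    return "awq -> " + base_handler(weights, config.base_config)


def apply_config(weights, config):
    """Look up the handler by config type, as a config-driven entry point would."""
    return _HANDLERS[type(config)](weights, config)


result = apply_config([0.1, 0.2, 0.3], ToyAWQConfig(base_config=ToyInt4Config(group_size=64)))
print(result)
```

The point of the design is that the wrapper never needs to know which quantization scheme it wraps; supporting a new scheme only requires registering another handler.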

kimishpatel · 2025-06-18T20:03:35Z

torchao/prototype/awq/api.py

+    dummy_mod = DummyModule(observed_linear.weight * equalization_scale)
+    quant_mod = base_config_handler(dummy_mod, config.base_config)


I am not sure what's happening here. Isn't module already an nn.Module?

this is just trying to quantize the weight with the quantization type specified by config.base_config
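A toy sketch of the trick described in the reply above, in plain Python with nested lists instead of tensors. `toy_base_handler`, `quantize_scaled_weight`, and the rounding "quantizer" are hypothetical stand-ins, and the sketch assumes (as in AWQ generally) that the input is divided by the same per-channel scale the weight was multiplied by.

```python
class DummyModule:
    """Throwaway container whose only job is to hold a weight."""

    def __init__(self, weight):
        self.weight = weight


def toy_base_handler(module, step):
    """Hypothetical base quantizer: snap every weight to a multiple of `step`."""
    module.weight = [[round(w / step) * step for w in row] for row in module.weight]
    return module


def quantize_scaled_weight(weight, equalization_scale, step):
    # Scale each input channel (column) of the weight before quantizing.
    scaled = [[w * s for w, s in zip(row, equalization_scale)] for row in weight]
    # Wrap the scaled weight so a module-based handler can be reused unchanged.
    dummy_mod = DummyModule(scaled)
    quant_mod = toy_base_handler(dummy_mod, step)
    return quant_mod.weight


def linear(weight, x):
    """Plain matrix-vector product, one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


weight = [[0.5, -0.25], [1.0, 0.75]]
equalization_scale = [2.0, 4.0]
qweight = quantize_scaled_weight(weight, equalization_scale, step=0.5)

# Dividing the input by the same scale compensates for the weight scaling.
x = [3.0, 8.0]
x_scaled = [xi / s for xi, s in zip(x, equalization_scale)]
print(linear(qweight, x_scaled), linear(weight, x))
```

In this example the scaled weights land exactly on the quantization grid, so both products agree; with real weights the two would differ by the quantization error.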

kimishpatel · 2025-06-18T20:04:40Z

torchao/prototype/awq/api.py

+    if config.set_inductor_config:
+        torchao.quantization.utils.recommended_inductor_config_setter()
+
+    observed_linear = module


If this is for linear only, should you not assert that this is an nn.Linear? Plus, how do you make sure this function is called only on nn.Linear?

yeah that's true, I will add an assert. We rely on the user to call quantize_ correctly, by specifying the filter_fn arg of the quantize_ API:

ao/torchao/quantization/quant_api.py

Line 578 in 4e3d019

filter_fn: Optional[Callable[[torch.nn.Module, str], bool]] = None,
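To make the two safeguards discussed here concrete (a caller-supplied filter plus an assert inside the transform), the following stdlib-only sketch mimics the `(module, fully_qualified_name) -> bool` shape of `filter_fn` shown above. `Module`, `Linear`, `LayerNorm`, `toy_transform`, and `toy_quantize_` are hypothetical stand-ins, not torch or torchao classes.

```python
from typing import Callable, Dict, List, Optional


class Module:
    """Minimal stand-in for torch.nn.Module: just a dict of named children."""

    def __init__(self, **children: "Module"):
        self.children: Dict[str, "Module"] = dict(children)


class Linear(Module):
    pass


class LayerNorm(Module):
    pass


def is_linear(module: Module, fqn: str) -> bool:
    """Filter with the same (module, fqn) -> bool shape as filter_fn."""
    return isinstance(module, Linear)


def toy_transform(module: Module) -> Module:
    # Defensive check: fail loudly if the caller's filter let a non-linear through.
    assert isinstance(module, Linear), f"expected Linear, got {type(module).__name__}"
    module.quantized = True
    return module


def toy_quantize_(
    model: Module,
    transform: Callable[[Module], Module],
    filter_fn: Optional[Callable[[Module, str], bool]] = None,
    prefix: str = "",
) -> List[str]:
    """Walk the module tree and transform every child accepted by filter_fn."""
    visited: List[str] = []
    for name, child in model.children.items():
        fqn = f"{prefix}.{name}" if prefix else name
        if filter_fn is None or filter_fn(child, fqn):
            transform(child)
            visited.append(fqn)
        visited += toy_quantize_(child, transform, filter_fn, fqn)
    return visited


model = Module(block=Module(proj=Linear(), norm=LayerNorm()), head=Linear())
converted = toy_quantize_(model, toy_transform, filter_fn=is_linear)
print(converted)
```

Without the filter, the walk in this sketch reaches the plain `block` container first and the assert fires, which is the failure mode the review comment is worried about.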

torchao/prototype/awq/api.py

metascroy

Changes look excellent!

jerryzh168 · 2025-08-02T01:07:23Z

cc @Xia-Weiwen and @xiaowangintel we updated AWQ, please check if it still works for CPU and XPU

Xia-Weiwen · 2025-08-02T14:31:44Z

> cc @Xia-Weiwen and @xiaowangintel we updated AWQ, please check if it still works for CPU and XPU

Thanks. Will check it out later.

Xia-Weiwen · 2025-08-04T15:29:03Z

torchao/prototype/awq/example.py

+        TransformerEvalWrapper(
+            model=model.to(device),
+            tokenizer=tokenizer,
+            max_seq_length=max_seq_length,
+            device=device,
+        ).run_eval(
+            tasks=tasks,
+            limit=calibration_limit,
+        )


Hi @jerryzh168, this part does not work with PPL on my side. It seems that we should use wiki2_eval for PPL. How did it behave when you ran it?

oh, we can probably remove PPL, I'm using lm-eval tasks to do evals

here is the list of tasks you can use: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/README.md

OK. Perhaps we need to update the default value of --tasks

Do you have a preference for the default value, i.e. the default task?

I see, you can use hellaswag I think

Sure. Thanks.
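A small sketch of what a `--tasks` flag defaulting to hellaswag could look like with argparse. How the example script actually defines the flag is not shown in this thread, so the `nargs`, help text, and `build_parser` helper are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI sketch; only the --tasks flag name comes from the thread."""
    parser = argparse.ArgumentParser(description="Toy evaluation CLI")
    parser.add_argument(
        "--tasks",
        nargs="+",
        type=str,
        default=["hellaswag"],  # assumed default, following the suggestion above
        help="One or more lm-eval task names",
    )
    return parser


# With no arguments the default task list is used.
default_args = build_parser().parse_args([])
# Explicit task names override the default.
custom_args = build_parser().parse_args(["--tasks", "mmlu", "gsm8k"])
print(default_args.tasks, custom_args.tasks)
```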

Hi @jerryzh168 I have created a PR for CPU: #2688 Please take a look. Thanks.

facebook-github-bot added the CLA Signed label Jun 18, 2025

jerryzh168 mentioned this pull request Jun 18, 2025

[WIP] Add AWQ quantization with QDQLayout support for ExecuTorch #2399

Open

kimishpatel reviewed Jun 18, 2025

jerryzh168 force-pushed the refactor-awq branch from d682cb5 to 8b1fca1 Compare June 24, 2025 22:42

jerryzh168 added the topic: improvement label Jun 24, 2025

jerryzh168 force-pushed the refactor-awq branch 2 times, most recently from 4d7eeb7 to 60f8852 Compare July 17, 2025 18:48

jerryzh168 mentioned this pull request Jul 25, 2025

[bc-breaking] Generalize QAT configs beyond intx quantization #2608

Closed

jerryzh168 force-pushed the refactor-awq branch from 60f8852 to 8afdb1f Compare July 31, 2025 03:08

jerryzh168 changed the title ~~[WIP] Make AWQ more general~~ Make AWQ more general Jul 31, 2025

jerryzh168 requested a review from metascroy July 31, 2025 03:13

metascroy reviewed Jul 31, 2025

metascroy approved these changes Jul 31, 2025

jerryzh168 force-pushed the refactor-awq branch 5 times, most recently from 7c91a04 to 9074308 Compare August 1, 2025 18:26

jerryzh168 force-pushed the refactor-awq branch from 9074308 to 8b229a7 Compare August 1, 2025 20:45

jerryzh168 merged commit e6cb79a into pytorch:main Aug 1, 2025
20 checks passed

jerryzh168 mentioned this pull request Aug 2, 2025

Convert SmoothQuant test to unittest #2659

Merged

Xia-Weiwen reviewed Aug 4, 2025

Xia-Weiwen mentioned this pull request Aug 5, 2025

[CPU] Fix AWQ on CPU after refactoring #2688

Open

MingxuZh mentioned this pull request Aug 8, 2025

[TorchAO] AWQ path fails after PR2400 with `ImportError: cannot import name 'awq_uintx' from 'torchao.prototype.awq'` intel/torch-xpu-ops#1919

Open

This was referenced Aug 10, 2025

replace FbgemmConfig with Int4WeightOnlyConfig #2727

Closed

Make SmoothQuant more General #2728

Open
