Make AWQ more general #2400

Merged: 1 commit merged into pytorch:main on Aug 1, 2025

Conversation

jerryzh168 (Contributor) commented on Jun 18, 2025:

Summary:

  • Added an AWQConfig that takes a base config, and made the corresponding changes in other parts of the flow (a brief usage sketch follows below)
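
For orientation, a minimal, hedged sketch of composing AWQ with a base config. The `base_config` field name is taken from the diff below; the import path and constructor details are assumptions, not taken from this PR, and a recent torchao with the config-class API is assumed:

```python
from torchao.quantization import Int4WeightOnlyConfig
from torchao.prototype.awq import AWQConfig  # import path assumed

# AWQConfig wraps an arbitrary base quantization config: AWQ computes
# per-channel equalization scales during calibration, scales the weights,
# and then quantizes them with whatever the base config specifies.
base = Int4WeightOnlyConfig(group_size=128)
awq_config = AWQConfig(base_config=base)  # field name mirrors config.base_config in the diff
```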

Test Plan:
Tested on Phi4-mini and Qwen3-8B

Qwen3-8B

| Task | calibration_limit | no-awq | awq |
|------|-------------------|--------|-----|
| leaderboard_math_hard (v3) | 2 | 0.3543 | 0.4371 |
| gpqa_main_zeroshot | 50 | 0.32 | 0.36 |
| mmlu | 5 | 0.7372 | 0.7463 |
| bbh | 1 | 0.7385 | 0.7556 |

Phi4-mini

| Task | calibration_limit | no-awq | awq |
|------|-------------------|--------|-----|
| mmlu_pro | 2 | 0.4057 | 0.4757 |
| gsm8k | 5 | 0.72 | 0.76 |

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot (bot) commented on Jun 18, 2025:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2400

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8b229a7 with merge base ffaf572:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Jun 18, 2025 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
Inline comment on the diff:

```python
            eps=eps,
        )
    else:
        observer = AWQObserver2(
```
Reviewer (Contributor) commented:
Can you not add kwargs to `AWQObserver` and just check `'base_config' in kwargs`?

jerryzh168 (Contributor, Author) replied:
Yes, this is temporary; I think we can deprecate the old one in the end.
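
For illustration only, the dispatch the reviewer describes might look roughly like this; the observer classes below are stand-ins, not the real `AWQObserver`/`AWQObserver2` signatures from the diff:

```python
# Stand-in observer classes, just to show the shape of the dispatch.
class AWQObserver:
    def __init__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs

class AWQObserver2:
    def __init__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs

def make_observer(*args, **kwargs):
    # Route to the new config-driven observer only when a base_config is
    # supplied, so legacy call sites keep working without a new keyword.
    if "base_config" in kwargs:
        return AWQObserver2(*args, **kwargs)
    return AWQObserver(*args, **kwargs)
```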



Inline comment on the `AWQConfig` definition:

```python
@dataclass
class AWQConfig(AOBaseConfig):
```
Reviewer (Contributor) commented:
OK, so this is consolidating with the `quantize_` API's config-based design?

Comment on lines +242 to +106:

```python
dummy_mod = DummyModule(observed_linear.weight * equalization_scale)
quant_mod = base_config_handler(dummy_mod, config.base_config)
```
Reviewer (Contributor) commented:
I am not sure what's happening here. Isn't `module` already an `nn.Module`?

jerryzh168 (Contributor, Author) replied:
This is just trying to quantize the weight with the quantization type specified by `config.base_config`.
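
In other words, the scaled weight is wrapped in a throwaway module so the generic config handler can quantize it. A rough, self-contained sketch of that pattern; the names mirror the snippet quoted above, but the function wrapper and `DummyModule` definition here are illustrative, not the PR's actual code:

```python
import torch
import torch.nn as nn

class DummyModule(nn.Module):
    """Illustrative stand-in: wraps a bare weight tensor in a module so a
    module-level quantization handler can be applied to it."""
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        self.weight = nn.Parameter(weight, requires_grad=False)

def quantize_scaled_weight(observed_linear: nn.Linear,
                           equalization_scale: torch.Tensor,
                           base_config,
                           base_config_handler):
    # Scale the observed linear's weight by the AWQ equalization scale, then
    # hand the wrapped weight to the handler registered for the base config
    # (e.g. an int4 weight-only config). The handler returns a module whose
    # weight has been quantized according to base_config.
    dummy_mod = DummyModule(observed_linear.weight * equalization_scale)
    quant_mod = base_config_handler(dummy_mod, base_config)
    return quant_mod.weight
```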

Inline comment on the diff:

```python
if config.set_inductor_config:
    torchao.quantization.utils.recommended_inductor_config_setter()

observed_linear = module
```
Reviewer (Contributor) commented:
If this is for linear only, shouldn't you assert that this is `nn.Linear`? Also, how do you make sure this function is called only on `nn.Linear`?

jerryzh168 (Contributor, Author) replied:
Yeah, that's true, will add an assert. We rely on the user to use `quantize_` correctly (it's done by specifying the `filter_fn` arg in the `quantize_` API):

```python
filter_fn: Optional[Callable[[torch.nn.Module, str], bool]] = None,
```
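
A small sketch of the `filter_fn` mechanism being referenced, using a plain base config so the example stays self-contained; the filter signature matches the `quantize_` line quoted above, while the toy model and the choice of `Int8WeightOnlyConfig` are illustrative and assume a recent torchao with the config-class API:

```python
import torch.nn as nn
from torchao.quantization import quantize_, Int8WeightOnlyConfig

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 16))

def linear_only(module: nn.Module, fqn: str) -> bool:
    # quantize_ calls filter_fn(module, fqn) on every submodule and only
    # applies the config where it returns True, which is how a linear-only
    # transform like AWQ is kept to nn.Linear at the call site.
    return isinstance(module, nn.Linear)

quantize_(model, Int8WeightOnlyConfig(), filter_fn=linear_only)
```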

jerryzh168 added the topic: improvement label on Jun 24, 2025.
jerryzh168 force-pushed the refactor-awq branch 2 times, most recently from 4d7eeb7 to 60f8852, on July 17, 2025.
jerryzh168 changed the title from "[WIP] Make AWQ more general" to "Make AWQ more general" on Jul 31, 2025.
jerryzh168 requested a review from metascroy on July 31, 2025.
metascroy (Contributor) left a comment:

Changes look excellent!

jerryzh168 force-pushed the refactor-awq branch 5 times, most recently from 7c91a04 to 9074308, on August 1, 2025.
jerryzh168 merged commit e6cb79a into pytorch:main on Aug 1, 2025 (20 checks passed).
jerryzh168 (Contributor, Author) commented:
cc @Xia-Weiwen and @xiaowangintel: we updated AWQ, please check if it still works for CPU and XPU.

Xia-Weiwen (Collaborator) replied:

> cc @Xia-Weiwen and @xiaowangintel: we updated AWQ, please check if it still works for CPU and XPU.

Thanks. Will check it out later.

Comment on lines +248 to +256:

```python
TransformerEvalWrapper(
    model=model.to(device),
    tokenizer=tokenizer,
    max_seq_length=max_seq_length,
    device=device,
).run_eval(
    tasks=tasks,
    limit=calibration_limit,
)
```
Xia-Weiwen (Collaborator) commented on Aug 4, 2025:

Hi @jerryzh168, this part does not work with PPL on my side. It seems that we should use wiki2_eval for PPL. How was it when you ran it?

jerryzh168 (Contributor, Author) replied:

Oh, we can probably remove PPL; I'm using lm-eval tasks to do the evals.

Xia-Weiwen (Collaborator) replied:
OK. Perhaps we need to update the default value of --tasks

Xia-Weiwen (Collaborator) asked:
Do you have a preference for the default value, i.e. the default task?

jerryzh168 (Contributor, Author) replied:
I see; you can use hellaswag, I think.

Xia-Weiwen (Collaborator) replied:
Sure. Thanks.
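
If it helps, switching the script's default eval task could be as small as the following argparse sketch; the `--tasks` flag name comes from the comments above, while the defaults and help text are illustrative rather than the script's actual code:

```python
import argparse

parser = argparse.ArgumentParser()
# Default to an lm-eval accuracy task (hellaswag) now that the PPL/wiki2
# path is no longer exercised by this example script.
parser.add_argument(
    "--tasks",
    nargs="+",
    type=str,
    default=["hellaswag"],
    help="lm-eval task names passed to TransformerEvalWrapper.run_eval",
)
args = parser.parse_args()
print(args.tasks)
```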

Xia-Weiwen (Collaborator) commented:
Hi @jerryzh168, I have created a PR for CPU: #2688. Please take a look. Thanks.

Labels: CLA Signed, topic: improvement

5 participants