
Conversation

@mikekgfb (Contributor)

Implement the AO API in torchchat quantization handlers and unify logic.

1 - Implement .quantize() for the torchchat (TC) quantization handlers and support args so they are consistent with AO.
2 - Remove the special handling for various combinations of parameters; instead, use validate_args before calling handlers with **q_kwargs.
3 - Remove the check probing whether a8wx was loaded successfully; instead, install an error-reporting handler when loading fails, which is invoked as the quant handler and raises an error.
4 - Unify the TC and AO quantization handler dicts under shared calling logic (a sketch of the intended pattern follows below).
5 - Provide an informational message when a quantizer option is not supported (detected via introspection).
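
Roughly, the calling pattern this aims for looks like the sketch below. This is a hand-written illustration, not the actual torchchat code: names such as EmbeddingOnlyQuantHandler, validate_args, quantizer_class_dict, and quantize_model are assumptions about the shape of the API.

```python
# Illustrative sketch only (assumed names; not the actual torchchat implementation).
import inspect

import torch.nn as nn


class EmbeddingOnlyQuantHandler:
    """Example torchchat-style handler exposing an AO-style .quantize() method."""

    def __init__(self, model: nn.Module, device="cpu", *, bitwidth: int = 8, groupsize: int = 32):
        self.model = model
        self.device = device
        self.bitwidth = bitwidth
        self.groupsize = groupsize

    def quantize(self, model: nn.Module) -> nn.Module:
        # ... replace embedding weights with quantized versions here ...
        return model


# One dict for both torchchat and AO handlers; entries share the same calling logic.
quantizer_class_dict = {
    "embedding": EmbeddingOnlyQuantHandler,
    # "linear:a8wxdq": Int8DynActIntxWeightQuantizer (or an error-reporting stand-in, see below)
}


def validate_args(handler_cls, q_kwargs: dict, quantizer: str) -> dict:
    # Use introspection instead of special-casing parameter combinations:
    # drop unsupported options and tell the user about them.
    accepted = set(inspect.signature(handler_cls.__init__).parameters)
    for key in list(q_kwargs):
        if key not in accepted:
            print(f"Warning: {quantizer} does not support option '{key}', ignoring it")
            del q_kwargs[key]
    return q_kwargs


def quantize_model(model: nn.Module, device, quant_options: dict) -> nn.Module:
    # Shared dispatch loop: validate options, construct the handler, call .quantize().
    for quantizer, q_kwargs in quant_options.items():
        handler_cls = quantizer_class_dict[quantizer]
        q_kwargs = validate_args(handler_cls, q_kwargs, quantizer)
        model = handler_cls(model, device, **q_kwargs).quantize(model)
    return model
```

With this shape, a call like quantize_model(model, "cpu", {"embedding": {"bitwidth": 4}}) validates the options against the handler's signature and dispatches every quantizer through the same loop, with no per-handler special cases.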

Added comment, and a missing self parameter
@pytorch-bot (bot) commented on Oct 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1291

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6926b9c with merge base 95ebcb8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Oct 10, 2024
@Jack-Khuu added the Quantization label on Oct 12, 2024
Fix arg order (args with default after args w/o default)
torchao_experimental_quant_api_spec.loader.exec_module(torchao_experimental_quant_api)
from torchao_experimental_quant_api import Int8DynActIntxWeightQuantizer
ao_quantizer_class_dict["linear:a8wxdq"] = Int8DynActIntxWeightQuantizer
quantizer_class_dict["linear:a8wxdq"] = Int8DynActIntxWeightQuantizer
Contributor

Any reason we can't put this inside of the quantizer_class_dict?

Contributor

It seems cleaner to try importing Int8DynActIntxWeightQuantizer first and fall back to ErrorHandler, then assign that handler to linear:a8wxdq in quantizer_class_dict.

Contributor Author

> It seems cleaner to try importing Int8DynActIntxWeightQuantizer first and fall back to ErrorHandler, then assign that handler to linear:a8wxdq in quantizer_class_dict.

This is what the code is doing now. Try to import, and if the import fails, set up the error handler.

Or were you thinking of a wrapper that imports the class Int8DynActIntxWeightQuantizer and then calls the error handler if the import fails, and the imported method if it succeeds? I assumed that all of the conditional import logic will go away soon-ish, since we should know which version of AO we pin and whether it has the new class. (And once that enablement is there, it doesn't disappear again, so we can just do a simple import in the future.)

Because __init__ can't return an alternate class, the wrapper would have to redispatch all methods internally. That isn't a big deal per se, but it may raise readability concerns. I'm happy to go either way, ideally as a follow-on.

LMK what you think the best long-term trajectory is for this functionality, and we'll align the code to that. (I think the current version is preferable to the previous state, because we don't have to special-case in the dispatch loop.)
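
For concreteness, here is a minimal sketch of the import-or-error-handler pattern being discussed. It is illustrative only: the real code loads the experimental module through an importlib spec (as in the snippet above) rather than a plain import, and the message text and structure below are assumptions.

```python
# Minimal sketch of "try to import, otherwise register an error-reporting handler".
# Assumed names and message; not the exact torchchat code.
quantizer_class_dict = {}  # the unified handler dict (defined elsewhere in practice)

try:
    from torchao_experimental_quant_api import Int8DynActIntxWeightQuantizer
except Exception:

    class Int8DynActIntxWeightQuantizer:  # error-reporting stand-in
        def __init__(self, *args, **kwargs):
            raise RuntimeError(
                "linear:a8wxdq is not available: the torchao experimental quant API "
                "failed to load. Build/install torchao with the experimental kernels "
                "to use this quantizer."
            )


# Either the real class or the stand-in ends up under the same key, so the shared
# dispatch loop never needs to special-case a8wxdq.
quantizer_class_dict["linear:a8wxdq"] = Int8DynActIntxWeightQuantizer
```

The wrapper alternative mentioned above would instead always register a single wrapper class and have it delegate each call to the real quantizer when the import succeeded, raising the error otherwise; functionally equivalent, just a different place to put the branch.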

Contributor

I think if this can go away soon, I don't have a strong opinion and can live with this.

@larryliu0820 (Contributor) left a comment

Thanks for the cleanup! Just one comment and please make sure CI jobs are passing

Fixed 2 typos.
@mikekgfb (Contributor Author)

> Thanks for the cleanup! Just one comment and please make sure CI jobs are passing

My pleasure. Sorry about that error; it should be fixed now.

@larryliu0820 (Contributor) left a comment

CI is passing

@Jack-Khuu merged commit 7d5ba09 into pytorch:main on Oct 23, 2024
52 checks passed